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NUMERICAL  METHODS  IN  CONTINUUM  MECHANICS 


FINAL  TECHNICAL  REPORT 

1.  STATEMENT  OF  THE  PROBLEMS  STUDIED 

Five  different  problems  are  studied  in  this  project.  They  are  stated 
briefly  below: 

(a)  HETEROGENEOUS  DIFFUSION  AND  THE  COMPACT  VOLUME  METHODS 

In  this  paper  we  study  heterogeneous  diffusion  in  the  context  of 

heat  conduction  in  laminated  material,  heat  conduction  with  phase  change 
(the  Stefan  problem)  and  fluid  flow  in  porous  media  (Richards' 

equation).  We  discretize  these  problems  via  the  compact  volume  method 
(CVM).  We  illustrate  that  this  method  is  higher  order  accurate  in  the 
flux  error  than  traditional  methods.  The  resulting  nonlinear  algebraic 
systems  are  approximated  by  different  versions  of  the  SOR  and  ADI 

schemes.  Vector  and  multiprocessing  implementations  were  presented, 
which  indicated  a  suitability  of  these  methods  for  high  performance 
computing. 

(b)  INDUCED  CONTRACTIVE  METHODS  FOR  FREE  BVPs 

In  this  paper  we  consider  numerical  schemes  for  the  solution  of 

nonlinear  algebraic  systems  which  evolve  from  the  discretization  of  the 
Stefan  problem,  and  from  the  fluid  flow  in  a  porous  media  problem, 
Richards'  equation.  We  presented  analysis  of  induced  contractive  methods 
for  free  boundary  value  problems.  Both  the  ADI  and  SOR  versions 
vectorize  very  well.  In  3D  problems  with  complicated  nonlinear  terms  we 
expect  the  ADI  version  to  be  more  robust  and  to  perform  equally  as  welt 
as  with  the  SOR  version. 

(c)  NUMERICAL  SOLUTION  OF  FLUID  FLOW  IN  PARTIALLY  SATURATED 
POROUS  MEDIA 

This  paper  describes  an  SOR  algorithm  for  solving  the  nonlinear 
algebraic  system  which  evolves  from  Richards'  equation  that  models  fluid 
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flow  in  a  porous  media.  The  moisture  content  and  hydraulic  conductivity 
functions  are  approximated  by  piecewise  linear  functions  obtained  from 
field  data.  The  resulting  algebraic  system  is  solved  by  a  variation  of  the 
nonlinear  SOR  algorithm.  The  advantage  of  this  approach  is  that  it  avoids 
some  of  the  numerical  oscillations  associated  with  large  derivatives  in 
the  data.  Numerical  calculations  are  presented  and  illustrate  the 
following:  (i)  agreement  of  the  numerical  model  with  observed  data, 

(ii)  dependence  and  comparison  results  as  a  function  of  uncertain  data, 
and  (iii)  suitability  of  these  algorithms  for  multiprocessing 
computations  via  domain  decomposition  methods.  Extension  of  these 
algorithms  to  heterogeneous  porous  media  fluid  flow  are  discussed. 

(d)  A  COMPACT  FINITE  VOLUME  SCHEME  FOR  2-D  STEFAN  PROBLEMS  AND 
VECTOR/MULTIPROCESSOR  COMPUTERS 

We  consider  both  the  compact  finite  volume  and  finite  difference 
space  discretizations  of  the  Stefan  problem.  The  resulting  algebraic 
systems  are  solved  by  nonlinear  versions  of  ADI  and  SOR.  Both  algorithms 
contain  significant  parallelism  which  is  demonstrated  on  two 
vector/multiprocessing  computers,  the  Alliant  FX/40  and  the  Cray  Y-MP. 
Numerical  experiments  indicate  that  the  compact  discretization  and  ADI 
give  the  best  accuracy  with  the  minimum  computational  cost. 

(e)  A  COMPARATIVE  STUDY  OF  COMPACT  FINITE  VOLUME  METHODS  FOR 
THE  2-D  AND  3-D  DIFFUSION  EQUATIONS  WITH  FINITE  DIFFERENCE  ADI 
AND  SOR 

Recently  developed  compact  finite  difference  scheme  (CPT)  were 
applied  to  two  and  three  dimensional  diffusion  equations.  The  relative 
merits  of  CPT-ADI  were  investigated  with  other  computational  schemes 
such  as  finite  difference-ADI  and  FDM-SOR.  The  numerical  results 
obtained  from  these  three  approaches  are  compared  to  known  analytical 
solutions.  According  to  our  results  CPT-ADI  was  found  to  be  a  superior 
scheme  with  regard  to  accuracy  and  speed. 
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2.  SUMMARY  OF  THE  MOST  IMPORTANT  RESULTS:  THREE  MOST 
IMPORTANT  RESULTS  ARE  GIVEN  BELOW: 

(i)  We  have  given  an  analysis  of  induced  contractive  methods  for  free 
boundary  value  problems.  The  SOR  version  and  the  ADI  version  of 
Algorithm  seemed  to  perform  ep  taliy  well.  However,  the  SOR  version  was 
very  sensitive  to  the  choice  of  SOR  parameter,  which  contrasts  with  the 
ADI  method  which  was  observed  to  be  fairly  robust  with  respect  to  its 
acceleration  parameters. 

Both  ADI  and  SOR  versions  vectorize  very  well,  and  as  expected,  in 
3D  problems  with  complicated  nonlinear  terms  the  ADI  version  was  found 
to  be  more  robust  but  perform  equally  as  well  as  with  the  SOR  version. 

(ii)  We  have  shown  for  a  number  of  heterogeneous  diffusion  problems 
that  compact  volume  discretization  method  gives  higher  order  error 
estimates  for  both  function  and  flux  errors.  Moreover,  we  have 
illustrated  that  both  the  ADI  and  SOR  algorithms  can  be  adapted  to  solve 
the  resulting  nonlinear  algebraic  systems.  Vectorlzation  and 
multiprocessing  computers  were  shown  to  be  effective  in  executing  these 
algorithms. 

In  the  applications  to  the  heat  conduction  in  a  laminate  materials, 
the  Stefan  problem,  and  Richard's  equation,  we  have  indicated  how  one  can 
extend  the  compact  volume  method  to  2D  and  3D  space  problems.  All  these 
problems  are  of  current  interest  to  physical  scientists  and  engineers. 
Moreover,  the  mathematical  analysis  of  convergence  problem  using  the 
more  complicated  versions  of  CVM  is  currently  needed. 

(iii)  The  Richard's  equation  is  approximated  by  the  finite  difference 
method,  and  the  empirical  data  for  the  moisture  content  and  the  hydraulic 
conductivity  were  approximated  by  pieecewise  linear  functions.  The 
resulting  non-linear  algebraic  system  is  solved  by  a  variation  of  non¬ 
linear  SOR  iterative  method. 

Good  convergence  properties  are  observed  for  three  types  of 
calculations  which  were  chosen  to  demonstrate  the  feasibility  of  realistic 
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numerical  simulations  using  the  SOR  iterative  methods.  These  includes  an 
accurate  simulation  of  fluid  flow  in  Brindabella  Loam,  a  sensitivity 
analysis  of  the  computed  solution  upon  the  empirical  data,  and  the  use  of 
multiprocessing  computers  via  domain  decomposition  methods. 

We  expect  the  methods  of  this  paper  to  generalize  via  the  compact 
volume  method  to  the  more  complicated  heterogeneous  case. 
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of  Scientific  Computing. 
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ATTACHMENT  #1  Induced  Contractive  Methods  for  the  Free 
Boundary  Value  Problems 
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Abstract.  In  this  paper  we  consider  numerical  schemes  for  the  solution  of 
nonlinear  algebraic  systems  which  evolve  from  the  discretization  of  the  Stefan 
problem,  and  from  the  fluid  flow  in  a  porous  media  problem,  Richards'  equation. 
Both  problems  generate  systems  of  the  form  E  +  At  A  (1(E)  =  d  where  (1(E)  is  continuous 
and  nondecreasing.  In  the  Stefan  problem  A  will  be  a  constant  symmetric  matrix. 
For  Richards'  equation  A  will,  in  general,  be  nonconstant  and  nonsymmetric. 
However,  in  both  cases  A  is  an  M-matrix.  We  consider  nonlinear  SOR  and  nonlinear 
ADI  schemes  which  may  be  implemented  on  vector/multiprocessing  computers. 


Subject  Classifications.  Primary  6SH10;  Secondary  76S0S,  76T0S. 
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§1.  Introduction.  The  primary  objective  of  this  paper  is  the 
development  of  algorithms  which  are  applicable  to  the  Stefan  problem  and  to 
Richards'  equation  for  fluid  flow  in  a  pc.ous  media,  see  Richards  (6],  Freeze  and 
Cherry  [1],  and  Paniconi  et  al.  [5].  Both  these  problems  have  similar  formulations, 
and  therefore,  some  of  the  algorithms  which  have  been  used  for  the  Stefan  problem 
may  be  applicable  to  the  more  complicated  Richards'  equation.  In  Silva  Neto  and 

White  [7]  and  [8]  a  local  nonlinear  SOR  algorithm  was  used  for  the  numerical  solution 
of  Richards'  equation.  In  the  present  paper  we  describe  methods  which  evolve  from 
the  contractive  algorithm  in  R.  E.  White  f  1 1],  and  this  is  reviewed  in  the  second 
section  of  this  paper.  Here  we  describe  nonlinear  SOR  and  nonlinear  ADI  iterative 
methods  which  can  be  used  for  two  and  three  space  dimension  problems.  We  will 
emphasize  the  implementation  on  vector/multiprocessing  computers. 

In  the  contractive  algorithm  the  linear  solve  step  can  be  approximated  by  an 
iterative  method,  and  an  induced  contractive  scheme  is  formed.  This  results  in  a 
nonlinear  nested  iteration  which  is  described  in  section  three.  In  sections  four  and 
five  we  study,  as  special  induced  contractive  methods,  the  nonlinear  SOR  and  ADI 

schemes.  Section  six  contains  numerical  illustrations  for  the  Stefan  problem.  In 

section  seven  the  similarities  of  the  Stefan  problem  and  of  Richard’s  equation  are 
discussed.  Computations  are  done  which  illustrate  that  this  method  gives  good 
comparisons  with  the  computed  and  with  the  observed  moisture  data  for  Brindabella 
loam  [10].  The  last  section  contains  the  conclusions  and  recommendations. 

§2.  The  Contractive  Algorithm.  In  this  paper  we  consider 

nonlinear  problems  of  the  form 

E  +  AtAp(E)  =  d  (1) 

where 

d,  E  e  9tN  , 

A  e  *NxN 

P(E)  =  [Pi(Ei)]  e  *N 

Pj:SR  — >  91  i  =  1, ... ,  N. 

For  the  Stefan  problem  E  is  the  enthalpy  and  P(E)  reflects  the  temperature.  The 
horizontal  part  of  Figure  1  corresponds  to  the  latent  heat. 


T 


E 


Figure  I.  Pj(E) 

Problems  of  the  above  form  evolve  from  the  implicit  time  discretization  of 
nonlinear  parabolic  equations.  The  matrix  A  is  from  the  elliptic  part  and  is  usually 
an  M-matrix,  see  [4],  or  a  symmetric  positive  definite  (SPD)  matrix.  Here  we 
emphasize  the  M-matrix  case  where  A  may  not  be  symmetric.  The  SPD  matrix  case 
can  be  studied  in  the  context  of  monotone  mappings,  see  sections  6.4  and  12.1  in  [4]. 

The  function  0(E)  =  [0j(Ej)J  e  91 N  where  Ej,  Ej  >  0  has  the  following  properties: 


(Ej  -  E  j)  cj  2  Pi(Ej)  -  0j(E  j). 

cj  >0, 

(2.1) 

Pi(Ej)  ■  Pj(E  j)  S  C2  (Ej  -  E  j). 

C2  SO, 

(2.2) 

Pj(Ej)  ^  C3  Ej  , 

c3  >0. 

(2.3) 

The  standard  example  is  from  the  Stefan  problem,  but  many  other  physical  problems 
can  be  put  into  the  above  form,  see  [2].  Later  we  will  emphasize  the  case  evolving 
from  fluid  flow  in  porous  media,  see  [1]. 

In  [11]  an  alternate  problem  was  proposed  so  that  (1)  could  be  solved  via 
successive  approximations.  It  has  the  form 

E  +  At  A(E)  E  =  d 

A(E)  =  [ay  (Pj(Ej)  /  Ej)]  and 
A  =  [ay]. 


where 


(3) 
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The  following  algorithm  was  studied  in  [11].  There  the  ajj  where  functions  of  E.  but 
for  the  present  we  assume  they  are  constants. 

Algorithm  1.  Contractive  iteration.  Consider  (3).  The  successive 
approximation  is 


Em+1  =  (I  +  At  A(Em))*l  d  ■  G(Em). 

If  At  is  small  enough,  then  p(At  A(E)>  <  5  <  1  for  all  E  >  0,  and  consequently, 
one  can  write  (I  +  At  A(E))' 1  in  the  form  of  a  geometric  series.  This  fact  and  some 
technical  arguments  were  used  to  show  that  G  is  a  contraction  on  some  box  in  9tN.  In 
practice  the  constraint  on  At  does  not  seem  to  be  severe.  The  following  is  a  special 
case  of  the  theorem  in  R.  E.  White  [11]. 

Theorem  1.  Consider  problem  (3)  and  Algorithm  1.  If  A  is  an  M-matrix  in  (1) 
and  each  0j  satisfies  (2.1)  -  (2.3),  then  for  suitably  small  At,  G(E)  is  a  contraction  on 
some  box  [eo,  2lldlloo]N. 

In  two  and  three  space  dimensions,  the  solve  step  will  most  likely  be  done  by 
an  iterative  algorithm.  Later  we  shall  focus  on  the  SOR  and  ADI  algorithms. 


§3.  Induced  Contractive  Method.  Let  S  c  91 N  and  G:S-»S  be 
contractive  with  0  <  c  <  1  and 


|G(£)-G(£)L£c|E-EL. 


Suppose  G  is  as  in  Theorem  1  where 


S  m  [eo,  2  lldlloo]N  and 
G(E)  ■  (I  +  At  A(E))*1  d. 


Algorithm  2.  Induced  contractive  iteration.  Let  I  +  At  A(E)  =  B(E)  -  C(E) 
be  a  splitting  that  defines  a  new  map  G:S— »S 

G(E)  =  EK 

where 

E°  =  E, 

Ek  =  B(E)'1  C(E)  Ek*l  +  BCE)'1  d  and 
1  <  k  <  K. 

Theorem  2.  Let  the  assumptions  of  Theorem  1  hold.  If  IIB(E)~ 1  C(E)lloo  <8<1 

for  all  E  e  S,  then  there  exists  a  constant  K  >  0  such  that  Em+1  »  G(Em),  where  E®e  S 

A 

and  G  is  given  by  Algorithm  2,  converges  to  the  solution  of  Es  =  G(ES)  e  S. 

Proof.  Let  H  =  B(E)‘l  C(E)  and  H  =  B(E)_1C(E). 

G(E)  =HEK-!  +  B-ld 

=  HkE+  [I  +  ...  +  HK1]  B*1  d 
=  HK  E  +  a  -  HK)  (I  -  H)*1  B'1  d 
=  Hk  E  +  (I  -  Hk)  (B  (I  -  H))'1  d 
=  Hk  E  +  (I  -  Hk)  G(E) 

=  Hk(E-  G(E))  +  G(E) 

If  E  =  Eg  =  G(Es),  then  G(ES)  =  Es. 

G(E)-G(E)  =  HK  (E-  G(E))  +  G(E)  -  HK(E-G(E))-G(E) 

=  HK  (E  -  G(E))  -  HK(E-G(E))  +  G(E)  + 

Hk(E-G(E))  -  HK(E-G(E»  -  G(E) 

=  HK(E-E)  -  Hk(G(E)-G(E))  +G(E)  -  G(E)  +  (HK  -HK)(E-G(E)) 

Let  E  =  Em  with  Em+1  =  G(Em),  and  E  =  Es  =  G(Es). 
gm+l.Eg  =  G(Em)  -  G(ES) 

=  HK  (Em  .  £s)  -  Hk  (G(Em)  -  G(Es))  +  G(Em)  -  G(Es) 


By  the  assumptions  on  H  and  G.  we  have 

llEm+l  .  Eslloo  ^  (5k  +  8k  c  +  c)  IIEm  -  Eslloo. 

Since  8,  c  <  1,  we  may  choose  K  so  that  8^  +  gK  c  +  c  ^  (l  +  c)  /  2  =  5  <  1. 
Thus  IIEra+l  -  Eslloo  £  5  IIEm  -  Eslloo  £  5m+1  HE®  -  Eslloo. 


Corollary.  Suppose  p(B^E)'l  C(E))  <  1  for  all  E  e  S.  Then  Em  converges  to  Es  provided 
K  =  K(m)  is  suitably  large. 
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Proof.  Since  pfBfE)'1  C(E))  <  1, 

H(E)k  =  (BCE)'1  C(E))k  ->  0  as  K  — > 

HEm+l  -  Eslloo  <  (IIH(Em)Klloo  +  IIH(Em)Klloo  c  +  c)  IIEm  -  Eslloo. 

For  large  enough  K  =  K(m)  we  have 

HEm+ 1  -  Eslloo  £  8  IIEm  -  Eslloo  <  &m+1  IIE°  -  Eslloo,  where  8  =  (1  +  c)  /  2. 


§4.  Nonlinear  SOR.  In  this  section  we  consider  the  SOR  version  of  the 
splitting  in  the  induced  contractive  method  in  Algorithm  2.  If  0  <  to  $  1,  then 
Algorithm  2  converges  to  the  solution  of  (1). 

Theorem  3.  Let  assumptions  of  Theorem  1  hold.  Consider  the  SOR  splitting 

1  1  —  © 

I  +  At  A(E)  =  ( —  (I  +  At  D(E))  -  At  L(E))  -  ( -  (I  +  At  D(E))  +  At  U(E)) 

CD  (0 

Moreover,  if  At  is  further  restricted  so  that  I  +  At  A(E)  is  uniformly  strictly  diagonally 
dominant,  then  we  may  choose  K  independent  of  m. 

Proof.  I  +  At  A(E)  is  an  M-matrix,  and  this  splitting  is  a  weak  regular 

splitting  for  all  E  e  S.  Thus  p(H(E))  <  1,  and  by  the  above  Corollary  Em  must  converge 
to  Es  provided  K  =  K(m)  is  suitably  large. 


If  we  restrict  At  so  that  I  +  At  A(E)  is,  uniformly  with  respect  to  E  €  S,  strictly 
diagonally  dominant,  then  we  can  show  that  the  assumptions  of  Theorem  2  hold.  This 
is  done  by  standard  method  and  is,  for  to  =  i. 


IS(£r'C(£)|  £  max - — _ - 


*5<1. 
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§5.  Nonlinear  ADI.  In  this  section  we  consider  the  ADI  version  of  the 
induced  contractive  method  in  Algorithm  2.  Here  the  analysis  is  based  on  M- 
matrices,  as  contrasted  with  SPD  matrices  in  Ortega  and  Rheinboldt  [4],  Later  this  will 
be  useful  as  the  analysis  does  not  require  symmetric  matrices  for  the  application  to 
Richards’  equation. 

Let  A  =  H  +  V  and  A(E)  =  H(E)  +  V(E)  where  H  and  V  represent  the  grid  rows  and 
grid  columns,  respectively.  H  =  (1/2)  1  +  At  H(E)  and  V  =  (1/2)  I  +  At  V(E)  so  that 
A(E)=  H  +  V. 

Theorem  4.  Let  the  assumptions  of  Theorem  1  hold.  Consider  the  ADI  splitting 

I  +  At  A(E)  =  B(E)  -  C(E) 

where 

B(E)*1  =  (a  I  +  V)*1  [I  +  (a  I  -  H )  (a  I  +  H)'1]  and 
C(E)  =  A(E)  -  B(E). 

There  exists  a  constant  ao  such  that  if  a  £  ao  ^  0.  then  Algorithm  2  converges  to  the 
solution  of  (1). 

Proof.  We  shall  show  that  the  splitting  is  a  weak  regular  splitting  of  the 
M-matrix  I  +  At  A(E). 

B(E)*1  =  (aI+  V)'1  (I  +  (a  I  -  H)  (a  I  +  H)*1] 

=  (a  I  +  V)’1  [(a  I  +  H)  +  (a  I  -  H)]  (a  I  +  H)' 1 
=  (a  I  +  V)-1  (2  a  I)  (a  I  +  H)_1. 

A  A 

By  the  conditions  on  0(E)  there  exists  ai  0  such  that  for  a  £  ai,  a  I  +  V  and  a  1  +  H 
are  M-matrices.  Thus  B(E)* 1  2  0.  Moreover,  there  exists  a2  £  0  such  that  for  a  2a2, 

A  A 

al-H,  a  I  -  V  2:  0.  Thus  for  a  ^  ao  =  max  (ai,  a2) 

B(E)*1  C(E)  =  (a  I  +  V)-1  (a  I  -  H)  (a  I  +  H)*1  (a  I  -  V)  ^  0. 

By  the  Corollary  to  Theorem  2  Algorithm  2  converges  to  the  solution  of  (1). 


§6.  Numerical  Experiments  for  the  Stefan  Problem.  In  this  section 
we  consider  the  numerical  solution  of  a  one-phase  Stefan  problem  in  two  space 
dimensions  where  the  domain  D  =  (0,1)  x  (0,1)  x  (0,T),  T  =  1/V2 .  The  exact  solution  of 
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where 


is 


Et  -  P(E)xx  '  P(E)yy  =  0 


[E+l, 

(3(E)  =  /0, 

U, 


E  <-l 
-1  <E  <0 
E  >  0 


E  = 


ret-(l//T)(*  +  y)_  I, 
-1, 


t-(l/yi)(x  +  y)2:0 
t-(lA/2)(x  +  y)<o’ 


The  initial  condition  on  E  and  the  Dirichlet  boundary  condition  on  P(E)  are 
implicitly  given  in  the  above  formula  for  E.  The  space  variables  were  discretized  by 
the  traditional  five  point  finite  difference  method  resulting  in  the  nonlinear 
algebraic  system  (1).  We  compared  the  performance  of  the  SOR  version  of  the 
induced  contractive  algorithm,  denoted  simply  as  FDM-SOR,  with  the  ADI  version  of 
the  induced  contractive  algorithm,  denoted  as  FDM-ADI.  The  computations  were  on 
the  vector/multiprocessing  computer  Cray  Y-MP.  The  Cray  was  used  as  a  single 

processor. 

The  parameters  set  for  this  algorithm  specified  the  number  of  nodes  in  each 
direction,  the  acceleration  variables  and  the  error  tolerance.  For  our  experiments, 

the  number  of  nodes  was  the  same  for  each  space  and  time  direction.  Thus,  the  time 
step  size  is  approximately  forty  times  the  maximum  allowable  time  step  size  for  this 
test  problem  and  for  the  explicit  method. 

Both  the  FDM-SOR  and  the  FDM-ADI  methods  showed  the  same  pattern  with 
respect  to  the  number  of  iterations  for  convergence.  The  tolerance  values  were  set 
at  ein  =  10*5,  eout  =  10*^,  for  the  inner  and  outer  iterations,  respectively.  The  number 
of  outer  iterations  was  three,  in  all  refinements  considered  of  a  basic  256  cell  (16  x 
16)  spatial  discretization,  which  we  will  subsequently  refer  to  as  the  basic  problem. 

The  number  of  inner  iterations  can  be  considerably  decreased  by  the  correct  choice 

of  the  acceleration  parameter.  For  the  SOR  version  of  Algorithm  2  we  used  co  =  1.5, 
1.6,  1.7,  and  1.8  for  the  number  of  cells  =  256,  1024,  4096,  and  16,384,  respectively.  For 
the  ADI  version  of  Algorithm  2  we  used  a  =  4.0,  6.0,  8.0,  and  10.0  for  the  number  of  cells 
=  256,  1024,  4096,  and  16,384,  respectively. 
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Our  intent  was  not  to  determine  an  optimal  value  for  the  acceleration 
parameter  but  rather  to  determine  an  appropriate  value  at  which  the  inner 
iterations  are  minimal  for  our  test  problem.  Our  numerical  experiments  showed  that 
the  appropriate  value  of  to  varied  between  1.5  and  1.8  for  all  refinements  of  the  basic 
problem.  Our  numerical  experiments  showed  that  the  value  of  the  acceleration 
parameter  a  increases  considerably  with  successive  refinements  of  the  basic 
problem.  Experiments  showed  that  a  can  be  chosen  without  much  consideration, 
which  is  an  advantage  of  the  FDM-ADI  version  of  Algorithm  2. 

Table  1.  CPU  time  (secs)  on  Cray-YMP, 

Algorithm  2/SOR 


ale  2  \  cells 

256 

1024 

4096 

serial 

EHM 

R9IBHI 

vector 

IB— 

mmmmmmmm 

In  our  experiments  we  used  two  versions,  serial  and  vector,  of  the  codes  that 
ran  on  the  vector/multiprocessing  computers.  The  purpose  here  was  to  determine 
the  degree  of  vectorizability  for  each  method.  Tables  1  and  2  show  the  CPU  times  for 
both  algorithms  to  solve  each  of  the  successive  refinements  of  our  basic  mesh.  In 
the  case  of  FDM-SOR  method  the  traditional  red-black  ordering  of  the  nodes  allows 
effective  vectorization.  In  the  case  of  FDM-ADI  method  we  implemented  the  vector 
version  of  the  basic  tridiagonal  solver. 


Table  2.  CPU  time  (secs)  on  Cray-YMP, 
Algorithm  2/ADI 


Tables  1  and  2  illustrate  that  the  FDM-SOR  method  exhibits  higher  speed-ups  as 
compared  to  those  of  the  FDM-ADI  method.  The  CPU  time  of  the  vector  FDM-SOR 
method  was  the  smallest.  Some  of  the  larger  CPU  time  for  the  FDM-ADI  method  is 
attributed,  in  this  case,  to  the  computation  and  storing  of  the  components  of  A(E). 

Next  we  considered  a  version  of  the  ADI  method  that  uses  variable  a.  The 
calculations  of  the  variable  a  were  done  using  the  scheme  in  [9]  where  a  =  At  and  b  = 
l/Ax^.  Although  this  scheme  was  designed  for  a  particular  boundary  value  problem, 
it  did  work  well  for  our  nonlinear  problem  and  was  fairly  robust.  Table  3  shows  the 
CPU  times  for  the  vectorized  SOR,  ADI  with  constant  a,  and  ADI  with  variable  a.  As  the 
number  of  unknowns  increases,  the  merits  of  the  ADI  method  with  variable  a  become 
clear. 

Table  3.  CPU  times  for  Algorithm  2 


§7.  Fluid  Flow  in  a  Porous  Media:  Richards'  Equation.  Richard 
equation  for  fluid  flow  in  a  porous  media  was  first  formulated  in  1931,  see  [6].  It  is  a 
nonlinear  parabolic  equation  for  the  pressure  head  =  ¥  where 

h  =  4*  +  z, 

and  h  is  called  the  hydraulic  head.  The  gravitational  direction  is  given  by  z.  Darcy's 
Law  is  used  to  form 

eCP)t  -  V  K(*P)  Vh  =  0  where 

K  =  hydraulic  conductivity  and  0  =  moisture  content.  They  are  empirical  functions  of 
4\  see  II]  and  [3]  and  Figures  2  and  3 


Figure  2.  K  =  K('P)  =  hydraulic  conductivity 


Figure  3.  8  =  e('F)  =  moisture  content 

In  references  [1]  and  [3]  it  is  noted  that  the  above  curves  may  differ  according 
to  "wetting"  or  "drying",  and  the  slopes  can  be  large.  The  functions  can  be 
approximated  by  rational  polynomials  of  order  4  or  S.  This  is  done  so  that  traditional 
analytic  methods  can  be  used;  however,  this  introduces  some  approximation  errors 
and  the  functions  are  expensive  to  compute. 


Richards'  Equation. 


1  2 


e(T)t  -  V-K(4*)  VH'  =  K(402 


dh 

Robin  boundary  conditions  are  typical  and  have  the  form  K('P) —  given  when  n  is  a 

dn 

unit  outward  normal  to  the  surface. 


In  a  recent  paper  by  Paniconi  et  ai.  [5]  several  numerical  methods  were 
described,  and  some  numerical  experiments  in  one  dimension  were  done.  In  the 
cases  were  the  slopes  of  8  and  K  were  large,  numerical  difficulties  were  encountered. 
The  moisture  function  6  (4*)  is  analogous  to  the  enthalpy  function  E  which  is  a 
discontinuous  function  of  temperature  in  the  Stefan  problem.  This  suggests  that  one 
may  be  able  to  apply  the  following  moisture  formulation  of  Richards'  equation. 

Let  E  =  0(4')  and  apply  the  Kirchhoff  transformation  to  the  hydraulic 
conductivity. 


4* 

u  =  F(4/)  s  J‘K(4')d4', 

4*o 

4»  =  p!(u), 

E  =  6(F-l(u)), 

P(E)  =  u  and 
7(E)  =  K(0-1(E)). 

Then  Richards'  equation  can  be  written  in  terms  of  E  where  E  is  the  primary 
unknown  and  represents  the  moisture. 

Moisture  Formulation  of  Richards'  Equation. 

Et  -  A  p(E)  =  y(E)z 

Here  the  term  on  the  right  side  makes  this  a  little  more  complicated  than  the 
Stefan  problem.  There  are  essentially  two  cases:  either  the  derivative  of  y  is  not  too 
large,  or  the  derivative  is  large  or  even  a  jump  discontinuity  occurs. 


1  3 

In  the  first  case  we  may  write  y(E)z  =  y'(E)  Ez  where  0  <  y’(E)  <  M  <  ».  a 
reasonable  implicit  time  discretization,  say  in  one  space  dimension,  is  for  Ej  from  the 
present  time  step  and  for  Et  from  the  previous  time  step 

+ >+ 20<£.  >  -  0<£... » -£.)=o 

Here  the  coefficient  matrix  is  more  complicated  than  in  (1),  but  it  is  still  an  M-matrix 
and  the  more  general  contractive  algorithm  in  [11]  can  be  used. 

The  second  case  is  when  y'(E)  is  so  large  that  y(E)  appears  to  have  jump 
discontinuities.  This  is  common  once  the  space  variables  have  been  discretized.  A 
special  case  has  been  studied  by  A.  Friedman  [2]  where  existence  of  a  weak  solution  to 
the  continuum  problem  is  established.  There  for  H  =  the  Heavyside  function 

0(40  =  a  *P  +  H(4»  -  4»o), 

K(¥)  =  ki  +  (k2  -  kj)  HOP  -  4*0), 

4>o  =  0,  k2  =  2  and  k]  =  1. 

Then  the  one  space  dimension  form  of  Richards'  equation  is 

(a  u  +  H(u))t  uXx  =  H(u)x. 

Let  E  =  a  u  +  H(u)  and  P(E)  =  u  so  that  E  =  a  P(E)  +  H(u),  H(u)  =  E  -  a  p(E)  and 

Et  -  Ex  -  P(E)xx  +  o  P(E)x  =  0. 

A  reasonable  discretization  has  the  form  of  (1)  or  (3).  Under  some  constraint  on  Ax 
A(E;  will  be  an  M-matrix,  and  the  theorem  in  [11]  will  be  applicable.  In  this  case, 

A(E)  is  tridiagonal,  and  hence,  there  is  no  need  to  use  an  induced  iterative  method. 

In  higher  space  dimensions  the  induced  iterative  methods  will  be  more  desirable. 


The  above  approximation  of  the  empirical  functions  is  very  restrictive.  In  ( 7 j 
and  [8]  more  accurate  approximations  are  used  so  that  diffusion  of  the  fluid  in  a 

partially  saturated  medium  can  be  calculated.  A  slightly  less  accurate  approximation 
which  will  track  a  wet-dry  interface  is  as  follows.  Here  we  have  set  I ff0  =  0. 

o  =  {a'¥+0"  ¥<° 

V*2y'  +  02,  yr£0 

and 

f*i,  ^<0 

1*2,  Y^0‘ 


In  our  calculations  we  shifted  the  horizontal  axis  to  the  right  so  that  6  and  K  have 
the  graphs  given  in  Figures  4  and  5.  We  used  the  K(0)  form  of  the  hydraulic 
conductivity  because  it  is  continuous. 


Figure  4.  0(u) 


-r-E,  E<6 \ 

K 

P(E)  =  ^0,,  ex<E<e2  , 

— *-(E-d2)  +  — l0l,  62<E 

.  *2  *i 


*(£)=  M££02 

itj,  02  <  & 


For  the  one  space  variable  problem  we  have 

£,-£(£)„-*<£),= 0  ,0 £x<Sl,r>0 
vvi'r/i  boundary  conditions 
P(E)X+K(E)  =  R  ,x  =  l,f  >0 
£(£),  + £(£)  =  0  ,x  =  0,f  >0 
anrf  initial  condition 
E-6r  ,0£x:£l,f  =  0. 

The  discretization  for  the  interior  cells  is 


£-£. + -!t<-/S(£,-i)  +  2/3(E.)- «£„,  *T(£,))  =  0. 

At  Ax  Ax 


The  matrix  in  (3)  is  formed  in  a  similar  way,  and  for  the  interior  cells 


a  (n.jjy 

“-,l  '  Ax1  £w  ’ 

_  1  1  ,,/><£.>  . 


Ar  Ax  £,  £, 


The  two  end  cells  incorporate  the  boundary  conditions  in  the  standard  way. 


In  the  following  calculations  we  used  the  parameters  that  reflect  Brindabella 

loam: 


6r  =.11  =  residual  moisture 
6{  =.27  =  from  the  moisture  data 
62  =.485  =  saturated  moisture 
k\  =  10-6 
*2  =.022 

R  =  0.016 5[m  /  hr]  =  flow  from  top 


No  flow  from  sides  or  bottom 
At  =.  125 [hr]  with  40  time  steps 
Ax  =.  300  /  20[m]  with  20  grid  cells 
0)  =  1.2  the  SOR  parameter 

=  10-6  absolute  error  for  inner  SOR 
£aut  =  10~*  absolute  error  for  outer. 


On  the  average  3  or  4  inner  iterations  and  5  to  8  outer  iterations  were  required 
for  convergence.  Figures  6a  and  6b  indicate  the  computed  and  observed  moistures. 
In  Figure  6b  the  diamonds,  triangles  and  squares  represent  the  observed  data  at  the 
times  1.0,  2.25  and  5.0  hours,  respectively.  Note  the  agreement  as  the  front 
progresses  from  the  top  (left)  to  the  bottom  (right).  Once  the  bottom  starts  to  fill  the 
hydraulic  conductivity  increases  and  the  diffusion  becomes  more  important.  In  [7] 
the  later  stages  of  this  are  modelled,  but  the  calculations  are  more  costly  than  in  the 
above  model. 
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Figure  6a.  Computed  Moisture 
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Figure  6b.  Observed  Moisture 

§8.  Conclusions.  We  have  given  an  analysis  of  induced  contractive 
methods  for  free  boundary  value  problems.  The  SOR  version  and  the  ADI  version  of 
Algorithm  2  seemed  to  perform  equally  well.  However,  the  SOR  version  was  very 
sensitive  to  the  choice  of  SOR  parameter,  and  this  contrasts  with  the  ADI  method 
which  was  observed  to  be  fairly  robust  with  respect  to  its  acceleration  parmater. 

Both  the  ADI  and  SOR  versions  vectorize  very  well.  In  3D  problems  with 
complicated  nonlinear  terms  we  expect  the  ADI  version  to  be  more  robust  and  to 
perform  equally  well  as  with  the  SOR  version. 

The  numerical  examples  ware  given  for  the  two  space  variable  Stefan 
problem.  Application  of  these  methods  to  Richards'  equation  was  outlined.  For  a  one 
space  variable  flow  through  the  Brindabella  loam  we  showed  that  the  computed  and 
observed  moisture  were  in  good  agreement. 
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ATTACHMENT  #2  Heterogeneous  Diffusion  and  the  Compact 
Volume  Method 


Heterogeneous  Diffusion 
and  the 

Compact  Volume  Methods 


by 


R.  E.  White1,  B.  N.  Borah2,  A.  J.  Kyrillidis1 


Abstract.  In  this  paper  we  study  heterogeneous  diffusion  in  the  context  of  heat 
conduction  in  laminated  material,  heat  conduction  with  phase  change  (the  Stefan  problem)  and  fluid 
flow  in  porous  media  (Richards'  equation).  We  discretize  these  problems  via  the  compact  volume 
method  (CVM).  We  illustrate  that  this  method  is  higher  order  accurate  in  the  flux  error  than 
traditional  methods.  The  resulting  nonlinear  algebraic  systems  are  approximated  by  versions  of  the 
SOR  and  ADI  schemes.  Vector  and  multiprocessing  implementations  are  presented,  and  they 
indicate  that  these  methods  are  suitable  for  high  performance  computing. 
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§1.  Introduction.  In  this  paper  we  consider  diffusion  problems  in 
heterogeneous  materials.  We  use  the  compact  volume  method  (CVM),  which  was  introduced  in 
M.  E.  Rose  [5],  to  discretize  the  governing  differential  equations.  The  advantage  of  CVM, 
relative  to  the  traditional  methods  such  as  finite  differences,  finite  elements  or  boundary  element 
methods,  is  that  CVM  is  higher  order,  in  both  the  solution  and  the  flux  errors,  accurate.  Here  we 
illustrate  quadratic  convergence  in  errors  in  ID  problems  related  to  heat  conduction  in  a  laminate, 
heat  conduction  with  phase  change,  and  fluid  flow  in  a  heterogeneous  porous  media. 

We  do  not  give  a  theoretical  analysis  of  the  convergence.  However,  the  resulting  algebraic 
systems  are  more  carefully  studied.  Some  iterative  methods  are  described,  and  criteria  for  their 
convergence  are  given.  Vectorization  and  multiprocessing  issues  are  discussed. 

In  section  two  we  describe  the  CVM  for  linear  heat  conduction  problems  with  discontinuity 
in  the  thermal  conductivity.  Sections  three  and  four  are  applications  to  the  Stefan  problem.  In 
section  four  we  describe  the  CVM  with  the  heat  balance  equation  at  the  solid-liquid  interface  and 
develop  a  scheme  which  converges  quadratically  for  the  solid-liquid  interface,  the  solution  and  the 
flux  errors.  The  fifth  section  is  an  application  to  Richards’  equation  in  ID  where  the  hydraulic 
conductivity  is  piecewise  linear  in  the  solution  and  has  a  jump  discontinuity  in  the  space  variable. 


§  2 .  CVM  for  Linear  Problems.  In  this  section  we  review  the  general  form  of 
CVM  as  presented  in  M.  E.  Rose  [5].  For  problems  of  the  form  -uxx  =  f  Rose  proved  quadratic 
convergence  in  both  u  and  ux.  In  order  to  establish  the  general  form,  consider  the  following 
example: 


-uxx  =  f  and  u(0)  =  0  =  u(l). 

Ui  U2  U3  U4  U5  U6  U7 

We  partition  the  [0, 1]  interval  into  4  cells.  The  even  nodes  are  inserted  to  insure  continuity 
of  the  flux  at  the  cell  boundaries.  Each  cell  has  length  equal  to  2  h. 


3 


i  =  odd: 

and 

i  =  even: 

Ui+1  -  Uj  Uj  -  Ui_j 

h  h  • 

The  resulting  algebraic  system  has  the  following  form  when  u  =  [uodd.  iWnlT  where  Uodd  is  the 
vector  of  odd  nodes  and  u*Ven  is  the  vector  of  even  nodes. 


[  Di  -^T«Ul  J4l 
l-c21  D2  L^2  J 

where 


4 = *2[/i  A  A  Af.  Ho  of. 

In  this  case  the  coefficient  matrix  is  irreducibly  diagonally  dominant,  an  M-matrix  and  also 
symmetric  positive  definite.  Therefore  iterative  methods  such  as  SOR  and  ADI  can  be  used. 
Moreover,  when  ordering  the  nodes  as  even  or  odd,  the  matrix  has  the  form  as  in  the  red-black 
ordering  for  the  FDM.  Consequently,  vector  pipelines  can  be  effectively  used. 

The  coefficient  matrix  has  a  block  structure  which  makes  it  easy  to  use  block  Gauss 
elimination.  For  example,  in  the  above  we  may  eliminate  Uodd  to  get 

[D2  -  C21  Df1  C12]  Ueven  =  d2  +  C21  Dj*1  dl  (2) 

where 

1  2  -1 

D2  -  C21  Dj*1  C12  =  ^7  -1  2  -1  . 

The  following  theorem  is  a  well  known  generalization  of  the  above. 


Theorem  1.  Consider  the  algebraic  system  in  (1).  If  Di  and  D2  -  C21  Df1  C12  are  nonsingular, 
then  (1)  has  a  solution  and  it  is  given  by  (2)  and  Di  Uodd  =  di  +  C12  ucven- 

The  next  example  illustrates  CVM  where  the  media  is  heterogeneous,  that  is,  the  coefficient 
in  the  differential  equation  has  a  jump  discontinuity  with  respect  to  the  space  variable. 

Example  1.  Heat  Conduction  in  a  ID  Laminate. 

-(k(x)  ux)x  =  10  x(l  -  x), 


u(0)  =  0  =  u(l)  and 


x  <0.5 
x  £  0.5  ‘ 


The  solution  is  formed  by  requiring  u  and  k  ux  to  be  continuous  at  x  =  0.5.  Both  the  CVM 
and  the  FDM  were  used.  The  FDM  has  n  cells  and  n  -  1  unknowns  and  the  CVM  has  n  /  2  cells 
and  n  -  1  unknowns.  Tables  1  and  2  illustrate,  for  this  example,  that  the  FDM  has  both  function 
and  derivative  errors  of  order  h  while  the  CVM  has  both  errors  of  order  h2. 


FDM; 


CVM; 


i  =  odd; 


_rk.  1  t  az&ii.hf. 

14  h  4  h  J  ni> 


where  1  £  i  £  n  and  k^  =  ^(k(xj)  +  kfx^j)). 


where  k^j  =  kCxy.!  +  e),  e>0  and 


1  =  even: 


Uj+i-Uj  Uj-Uj-! 

Ki  h  "Ki  h 


where  kf  =  k(xj  ±  e),  e  >  0. 
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Table  1.  FDM  for  Heat  Conduction  in  Laminate 


n\  error 

Fct.  Error 

Flux  Error 

8 

0.002456 

0.159  566 

16 

0.001227 

0.081413 

32 

0.000750 

0.041091 

64 

0.000495 

EBS5MB1 

128 

EEBnl ■ 

0.010342 

Table  2.  CVM  for  Heat  Conduction  in  Laminate 


n\error 

Fct  Error 

Flux  Error 

8 

ftfoMF! 

«■»' 

16 

32 

64 

0.000235 

128 

The  CVM  generalizes  to  the  2D  domain  in  the  obvious  way  when  a  rectangular  grid  is 
used  If  an  irregular  2D  domain  is  required  then  one  may  use  a  variation  of  the  boundary  element 
method  (see  E.  K.  Bruch  [1])  and  finite  element  method  (see  R.  E.  White  [10]).  For  example,  if 
we  wish  to  use  triangular  elements  and  linear  shape  functions  on  -Vk  Vu  =  f,  then  we  can  insert 
an  additional  center  node  (node  o)  in  each  element  (These  are  analogous  to  the  odd  nodes  in  the  ID 
case),  see  Figure  1. 
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Figure  1.  CVM  for  2D  Element 

First,  we  apply  the  boundary  element  idea  to  each  element  and  solve  for  u  at  each  center 
node.  This  amounts  to  applying  the  Divergence  Theorem  to  each  element.  Let  q  =  -k  Vu.  Then 
integrate  Vq=  f  over  e 


jimi 


By  the  Divergence  Theorem 


lqnds“JI 


Here  we  assume  u  is  known  at  the  nodes  1, 2  and  3.  The  u  can  be  expressed  as  linear  function 
above  each  subelement  «i,  C2  and  e3.  Thus,  the  above  boundary  element  equation  has  just  u  at  the 
center  nodes  as  the  only  unknown. 

Second,  we  form  the  equations  for  each  unknown  at  the  vertex  nodes  of  every  element 
(These  are  analogous  to  the  even  nodes  in  the  ID  case).  This  is  done  by  requiring  q  to  be 
continuous  at  each  edge  of  every  element.  This  takes  the  form  of  3  simultaneous  equations  for 
every  element,  and  these  are  easily  solved.  One  can  also  combine  the  four  equations  and 
simultaneously  solve  for  the  four  unknowns  in  every  element.  In  either  case  the  resulting  system, 
via  assembly  by  elements,  can  be  solved  directly  or  iteratively. 
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§3.  Application  to  the  Stefan  Problem.  The  FDM  when  applied  to  free  B VPs 
generates  nonlinear  algebraic  problems  of  the  form 

E  +  At  A  (5(E)  =  d 


where  A  is  a  matrix  associated  with  the  elliptic  part.  The  compact  volume  method,  CVM,  as 
described  in  [6, 7, 11]  generates  a  larger  system  of  the  form 


+ 


D, 

-C2i 


(3) 


where  Dj,  D2  are  diagonal  matrices,  C12  and  C21  are  rectangular  matrices.  (5(E)  is  as  in  the  FDM 
and  is  the  temperature  at  the  center  of  the  cells.  And  u  is  the  temperature  at  the  boundary  of  the 
cells.  The  diffusion  equation  for  each  cell  (i  =  odd)  is 


Ei 

At 


2h 


u^-pOEi)  PCE^-Ui.! 

h  h 


di. 


(4) 


The  flux  continuity  equation  at  cell  boundary  (i  =  even)  is 


1 

[P(EW)-Ui 

ui-P(Ei_i) 

2h 

h 

h 

(5) 


Equation  (3)  should  be  viewed  as  a  first  version  of  the  CVM  method.  We  will  want  to  write  it  in 
the  alternate  form,  as  in  the  FDM, 


— +Dj(E)  -C12 
-C2i(E)  D2 


(6) 


where  D^Jsdiag 


di 


P(Ei)l 


Ei 


P(E)  P(E) 

=  D!  — g—  and  C21(E)  =  C21 -g— . 


In  two  and  three  dimensions  the  problem  is  more  complicated,  but  it  appears  that  ADI  schemes  will 
be  useful  in  approximating  the  solution  to  such  problems. 
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Consider  the  simplified  problem  (6).  If  D2  is  nonsingular,  then  we  may  solve  for 
u  =  D21[d2  +  C21(E)E]. 

Then  — -Ci2D2  C2i(E)  +  D!(E)  E  =  di  +  Ci2D2  d2 

^  +  (D,  -  C,2  Di1  Cj,)  E  =  d,  +  C12  Di*  d2 . 

Or,  —  +  (D,-C12Di1C21)|5(E)  =  d.  (7) 


Thus,  the  reduced  problem  in  (7)  has  the  form 

Jj  +  AP(E)  =  d 

where  A  is  a  symmetric  M-matrix.  Provided  A  and  d  have  been  computed,  SOR  and  ADI  methods 
as  described  in  section  6.4  and  12.1  in  [3],  will  be  applicable. 

A  more  direct  approach  is  to  consider  the  initial  problem  in  (6).  Algorithms  1  and  2  are 
contractive  as  in  the  discretization  given  for  the  FDM,  see  [1 1]. 


Algorithm  1. 


Contractive. 


5+D'(E”') 

-C2i(Em) 


-C12 

D2 


(8) 


Theorem  2.  If 


Di  -C12 
-C2i  d2 


is  an  M-matrix,  then  for  suitably  small  At,  Algorithm  1  converges 


to  the  unique  solution  of  (6). 
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Algorithm  2.  Induced  Contractive. 


Let 


ir®'® 

~C\2 

-C2i(E) 

D2 

=  B(E)  -  C(E)  be  a  splitting  of  the  matrix  in  (6). 


Let 


Em+1'0 

^«+l,0 


and  define  the  iterates  for  0<  k  <  K-l 


(9) 


The  next  theorem  is  a  direct  application  of  the  results  in  [1 1]. 

Theorem  3.  If  pfBfE)-1  C(E))  £  8  <  1  and  the  assumptions  of  Theorem  1  holds,  then  for  K 
suitably  large  Algorithm  2  converges  to  the  solution  of  (6). 

Example  2.  One  phase  Stefan  Problem  in  2D  and  Algorithm  2. 


In  this  example  we  use  Algorithm  2  to  compute  the  numerical  solution  of  a  one-phase 
Stefan  problem  in  two  space  dimensions  where  the  domain  D  =  (0,1)  x  (0,1)  x  (0,T),  T  =  1/V2 . 
The  exact  solution  of 


Et-p(E)xx-3(E)yy  =  0 

where 

(E+l,  E  <-l 

3(E)  =  <  0,  -l^ESO 

(E,  E>0 

is 

fet-(i//2)(x+y)_  !,  t  -  (1/^2)  (x  +  y)  £  0 
E  =  < 

l-l,  t  -  (1/^2)  (x  +  y)  <  0 


The  SOR  splitting.  Note  this  is  vectorizable  because  Dj  and  D2  are  diagonal  matrices.  The 
block  structure  in  (6)  reflects  the  red-black  ordering.  Vectorizarion  is  easily  implemented. 

The  ADI  splitting.  Here  the  nodes  in  (6)  must  be  reordered  to  reflect  the  classical  order.  For 
the  above  example  the  unknowns  will  be  ordered  as 

[Ej,  u2,  E3,  u4,  E5,  u6,  E7]t 

This  gives  collections  of  independent  tridiagonal  systems,  and  hence,  vector  versions  of  tridiagonal 
solvers  can  be  used. 

This  example  illustrates  the  SOR  and  ADI  splitting  of  Algorithm  2  on  the  above  Stefan 
problem.  We  compared  the  performance  of  the  SOR  version  of  Algorithm  2,  which  we  will  denote 
simply  as  CVM-SOR,  with  the  ADI  version  of  Algorithm  2,  denoted  as  CVM-ADI.  The 
computations  were  done  on  the  Cray  Y-MP.  The  parameters  set  for  Algorithm  2  specified  the 
number  of  nodes  in  each  direction,  the  acceleration  variables  and  the  tolerance.  For  our 
experiments,  the  number  of  nodes  was  the  same  for  each  space  and  time  direction.  Thus,  the  time 
step  size  is  approximately  forty  times  the  maximum  allowable  explicit  time  step  size  for  this  test 
problem. 

The  tolerance  values  were  set  at  =  10-5,  Eout  =  10"4,  for  the  inner  and  outer  iterations, 
respectively.  We  considered  four  refinements  of  the  spatial  discretization,  and  these  were  for  n  = 
16, 32, 64,  and  128  cells  in  each  direction.  Our  intent  was  not  to  determine  an  optimal  value  for 
the  acceleration  parameters  but  rather  to  determine  an  appropriate  value  at  which  the  inner  iterations 
are  minimal  for  our  test  problems.  For  the  CVM-SOR  method  numerical  experiments  showed  that 
the  appropriate  values  of  co  were  1.5,  1.6, 1.7,  and  1.8  for  n  =  16,  32,  64  and  128,  respectively. 
For  the  CVM-ADI  method  numerical  experiments  showed  appropriate  values  of  acceleration 
parameter  a  were  4.0,  6.0,  8.0  and  10.0  for  n  =  16,  32,  64,  and  128,  respectively.  Experiments 
showed  that  a  can  be  chosen  without  much  consideration,  which  is  an  advantage  of  the  CVM-ADI 
version  of  Algorithm  2.  We  also  considered  a  version  of  CVM-ADI  that  uses  a  more  sophisticated 
acceleration  scheme  with  variable  Oj. 


Table  3.  CPU  time  (secs)  on  Cray  Y-MP,  for  CVM-SOR 


alg  2\n 

n  =  16 

n  =  32 

n  =  64 

n  =  128 

serial 

0.06 

0.76 

10.27 

125.58 

vector 

0.02 

0.12 

iill 

12.11 

In  our  experiments  we  used  two  versions,  serial  and  vector,  of  the  codes  that  ran  on  the 
Cray.  The  purpose  here  was  to  determine  the  degree  of  vectorizability  in  each  method.  Tables  3 
and  4  show  the  CPU  time  both  algorithms  took  to  solve  each  of  the  successive  refinements  of  our 
basic  mesh.  In  the  case  of  CVM-SOR  method  the  block  structure  of  the  problem  reflects  the 
traditional  red-black  ordering  of  the  nodes  and  thus  allows  for  effective  vectorization.  In  the  case 
of  CVM-ADI  method  we  implemented  the  vector  version  of  the  basic  tridiagonal  solver. 


Table  4.  CPU  time  (secs)  on  Cray  Y-MP,  for  CVM-ADI 


alg2\n 

n  =  16 

n  =  32 

n  =  64 

OO 

CN 

It 

a 

serial 

0.31 

2.58 

tisz 

254.21 

vector 

0.13 

o.77  : 

IHHii 

50,55 

Clearly  the  CVM-SOR  method  exhibits  higher  speedups  as  compared  to  those  of  the  CVM- 
ADI  method.  Both  methods  exhibit  good  accuracy.  As  noted  above  we  also  considered  a  version 
of  CVM-ADI  that  uses  variable  acceleration  parameters,  Oj.  Table  5  shows  the  CPU  time  of  the 
vectorized  CVM-SOR,  CVM-ADI  and  the  CVM-ADI  with  variable  cti’s,  denoted  as  ADIv.  Here 
we  see  that  the  variable  a  version  of  CVM-ADI  is  competitive  with  CVM-SOR,  and  the  CVM- 
ADIv  seems  to  be  more  robust  than  the  CVM-SOR.  These  results  are  consistent  with  those  in  [1 1] 
where  the  FDM  is  considered. 

Table  5.  CPU  times  of  the  Versions  of  Algorithm  2 


alg\n 

n  =  16 

n  =  32 

n  =  64 

n=  128 

SOR 

0.02 

_  Hi 

0.12 

1.14 

12.11 

ADI 

0.13 

0.77 

6.28 

50.55 

ADIv 

0.05 

0.27 

1,43 

10.12 

1  2 


The  single  sweep  ADI  method  also  works  quite  well  for  two  dimensional  CVM 
discretizations  of  the  form  (6).  Let 


Dj  -C12 
-C2i  D2 


=  H  +  V, 


H  = 


2S  0 
0  0 


+  H  and 


V  = 


I 

2At 


0 

0  0 


+  v. 


H(E)  and  V(E)  are  consistent  with  the  notation  in  (6).  Then  we  can  write 


— +Di(E)  -C12 

-C2i(E)  D2 


=a+H(E))-a-v(E»=a+v(E))-a-H(E)). 


From  here  one  can  use  the  two  step  ADI  sweeps  (See  Rose  [6]),  or  implicit  single  ADI  sweeps  as 
follows. 

Algorithm  3.  Single  ADI  sweep. 

Consider  the  time  dependent  CVM  method  where  a  sequence  of  algebraic  problems  of  the 
form  (6)  are  considered.  Let  n  =  time  step.  The  single  sweep  ADI  method  is  for  E  =  [E,  u]T 

(a  I  +  H(E"+i/2))  En+i/2  =  (a  I  -  V(En))  E"  +  d  and  (10.1) 

I  (a  I  +  V(En+1»  En+1  =  (a  I  -  H(E"+1/2))  +  d.  (10.2) 

■ 

I 


The  nonlinear  problems  (10.1)  and  (10.2)  may  be  approximated  via  the  contractive 


mapping;  for  example,  in  (10.1)  with  d  the  right  side  of  (10.1) 

Ek+1  =  (a  I  +  H(Ek))-i  d. 

Here  a  must  be  suitably  large  and  can  be  used  adaptively  to  obtain  optimal  convergence. 

Example  3.  One  phase  Stefan  problem  and  Algorithm  3. 

This  example  illustrates  Algorithm  3  and  the  use  of  vector/multiprocessing  computers. 
Consider  the  above  2D  one  phase  Stefan  problem.  The  computations  in  Table  6  were  done  on  an 
Alliant  FX/40  and  Cray  Y-MP.  The  Alliant  FX/40  has  two  processors  each  with  vector  pipelines 
and  170  nsec  clock  cycle  time,  and  the  Cray  Y-MP  was  used  with  one  CPU,  vector  pipeline  and  4 
nsec  clock. 

The  CVM  space  discretization  allows  effective  use  of  independent  tridiagonal  solvers  in 
either  the  form  of  vectorized  tridiagonal  or  multiprocessing  tridiagonal  algorithms.  In  the  case  of 
the  Alliant  FX/40  both  versions  were  used  for  the  (2  epu,  vector)  calculations.  Table  6  records  the 
computing  times  (sec)  for  n  cells  in  each  direction  and  five  different  computer  configurations. 

Table  6.  ADI  and  CPU  Times 


computer  \n 

n  =  16 

n  =  32 

n  =  64 

Alliant  (serial) 

z.ui  . 

122.28 

Alliant  (vector) 

1.02 

IHII 

38.63 

Alliant  (2  epu,  vector) 

0.57 

3.32 

21.07 

Cray  (serial) 

w  «mm 

0.129 

0.89/ 

IvXwXvA'X'i'XvXvXvXv'vX'lviv!'!' 

ill 

Cray  (vector) 

0.32 

1*72 

§4.  Application  to  the  Stefan  Problem:  Higher  Order  Approximation. 

A  more  complicated  problem,  in  one  dimension,  is  the  combination  of  (6),  (11),  (12)  and  (13). 
Here  we  will  use  an  nonlinear  SOR  scheme  and  use  the  generalization  of  M-matrices  to  M- 
functions  as  described  in  section  13.5  of  [3].  More  importantly,  we  will  want  to  consider  the 
location  of  the  solid-liquid  interface;  here  the  flux  is  not  continuous  as  in  (5). 
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Suppose  that  at  some  iteration  we  observe  P(Ej)  >  uf  and  P(E2)  <  uf.  Between  Xj  and  x2 
there  must  exist  an  s  such  that  j3(E  at  s)  =  uf ,  see  Figure  2. 


Figure  2.  Phase  Change 


Thus  (5)  must  be  modified  in  three  ways. 
At  xltwe  are  in  a  liquid  cell  [xq,  s]  and 


Ei 

1 

Uf-  P(Ef) 

P(Ef)— u0 

At" 

s-x0 

S  —  Xj 

h 

=  di 


(11) 


At  s,  we  have  the  classic  heat  balance  equation  and 


uf-  p(Ef) 

P(E2)  -  Uf 

S  —  Xj 

X2-S 

(12) 


At  x2,  we  are  in  a  solid  cell  [s,  x3]  and 


E2 

1 

u3-p(E2) 

P(E2)-uf 

At 

x3-s 

h 

x2  —  s 

—  di 


(13) 
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Definition:  The  CVM  for  the  one  dimensional  free  BVP  has  the  form  of  (6)  supplemented  with 
equations  (11),  (12)  and  (13). 

The  reader  will  note  that  equation  (12)  is  a  cubic  with  three  distinct  real  roots.  If  we  require 

\s-s\<h,  (14) 

then  we  must  select  the  middle  root  The  condition  in  (14)  is  a  discretization  of  ^|Ar  <  h  which  has 
been  noted  as  an  important  constraint  to  avoid  oscillations  near  the  solid-liquid  interface,  see  [7]. 


Figure  3.  Solution  of  the  Cubic  Equation  (12) 

This  seems  to  be  a  CFL  type  condition  for  the  moving  solid-liquid  front.  The  analysis  of 
(3)  or  (6)  can  be  done  in  either  the  context  of 

Di  “Ci2 

-C2i  d2 
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being  a  M-matrix  or  a  SPD  matrix.  For  the  M-matrix  case  the  system  consisting  of  (6)  coupled 
with  (11),  (12)  and  (13)  is  most  likely  an  M-function,  see  [3].  Consequently,  nonlinear  SOR 
methods  are  applicable.  Indeed,  careful  inspection  of  Figure  2  gives  the  off-diagonal  antitone  and 
the  diagonally  isotone  conditions  for  the  component  of  the  system  for  the  unknown  interface.  The 
other  nodes  have  equations  which  are  similar  to  the  reduced  enthalpy  formulation  of  the  Stefan 
problem,  and  the  reduced  system  is  known  to  be  an  M-functions. 

Algorithm  4.  Enhanced  SOR  for  two  phase  Stefan  problem. 

Consider  (6),  (11),  (12)  and  (13)  where  the  nodes  i  =  1  and  2  represent  a  possible  change  in  phase 
at  some  point  in  the  iteration,  require  [s  - 1|  £  h  so  that  we  must  chose  the  center  root  of  (12): 

do  an  nonlinear  SOR  sweep  of  (6) 
solve  (12)  for  the  center  root 
solve  (11)  and  (13) 
repeat  until  convergence. 

The  advantage  of  the  more  general  CVM  problem  (6),  (11),  (12)  and  (13)  is  that  the  order 
of  convergence  is  higher  than  (6),  or  the  FDM.  This  has  been  observed  for  a  number  of 
experiments  for  one  dimensional  problems.  The  extension  to  two  or  three  space  dimensions  will 
be  a  little  more  challenging.  It  appears  that  the  solid-liquid  interface  condition  in  higher  dimensions 
can  be  decoupled  into  equations  similar  to  (11),  (12),  and  (13).  This  suggests  that  a  nonlinear 
ADI  scheme  can  be  developed.  In  each  direction  a  SOR  algorithm  similar  to  Algorithm  4  can  be 
used.  Thus,  we  hope  for  two  and  three  space  dimensions,  to  be  able  to  obtain  higher  order 
convergence  of  the  discrete  problem  to  the  continuum  problem,  and  to  be  able  to  more  accurately 
locate  the  solid-liquid  interface.  The  above  schemes  have  a  large  portion  of  independent  parts,  and 
hence,  high  performance  computing  can  be  used. 

Example  4.  Two  phase  Stefan  problem  and  solid-liquid  interface. 

E,  -  P(E)„  =  0 


where 


m  =  f,’ 

l(E-2)+l, 


E  <  1 
1  <E  £2 . 
E>2 


The  boundary  and  initial  conditions  were  chosen  so  that  the  following  was  the  solution 


£  /2elx,  x<t 

\el"\  x£t 

The  solid-liquid  interface  is  given  by  s(t)  =  t.  In  the  calculation  we  started  at  t  =  0.2  and 
ended  at  t  =  0.37.  Over  this  time  interval  the  average  function  and  flux  errors  were  tabulated  as 
well  as  the  approximate  value  of  the  solid-liquid  interface.  Tables  7  illustrates  quadratic  (Ax2) 
convergence  for  At  =  Ax2  and  this  example,  in  the  solid-liquid  interface  error  and  the  function 
error.  The  convergence  of  the  derivative  error  was  at  least  linear  (Ax)  in  all  cases. 

Table  7.  CVM  for  Stefan  Problem  with  At  =  Ax2 


n\  error 

s  Error 

Fct  Error 

Der.  Error 

10 

0.003900 

.  ^  gs 

0.007000 

0.030100 

20 

0.001000 

0.014100 

40 

80 

0.000060 

fi:  onni a  7 

MW*?*  . ''' . 

0.002270  1 

160 

0.000016  s 

isstiii'f  •. 

Even  if  the  criteria  on  At  is  relaxed,  the  errors  remain  relatively  small  as  is  illustrated  by 
Table  8. 


Table  8.  CVM  for  Stefan  Problem  with  At  =  0. 1  Ax 


n\  error 

s  Error 

Fct.  Error 

Der.  Error 

^^^^^88888888888888888^^^^^ 
QBSSHn ■ 

0.013500 

40 

EESBl 

— 

80 

mmmm 

ROBI 

160 

miijikiimimli 

§5.  Application  to  Richards'  equation.  Richards  equation  describes  fluid 
flow  in  a  porous  media  and  was  initially  formulated  in  L.  A.  Richards  in  1931  [4],  Presently,  it  is 
being  extensively  used  in  ground  water  modeling  and  in  contamination  modeling,  see  R.  A.  Freeze 
and  J.  A.  Cherry  [2].  This  equation  has  strong  nonlinear  terms  in  the  empirical  functions  for 
moisture  and  hydraulic  conductivity,  see  [11].  Moreover,  the  porous  media  are  often 
heterogeneous;  that  is,  they  are  dependent  on  space  position  and  often  have  discontinuities  in  the 
space  variable.  A  simple  ID  example  of  the  steady  state  Richards'  equation  is  as  follows  where  u 
is  the  pressure  and  k(x,u)  is  the  hydraulic  conductivity. 


Example  5. 


where 


Richards'  Equation  in  ID. 


-(k(x,u)  ux  +  k(x,u))x  =  f(x) 


u(0)  =  0,  u(l)  =  1  +  ai  +  ao  and 


k(x,u) 


/  3  u  +  0.5, 
\u+  1, 


x  <0.5 
x  2t  0.5  ’ 


The  right  side  was  chosen  so  that 


+  a,  x  +  ao. 


x<0.5 

x  £  0.5 


was  a  solution.  The  constants  ai  and  ao  were  chosen  so  that  u  and  k  ux  +  k  were  continuous  at  x  = 
0.5.  The  CVM  has  the  form 

i  =  odd:  -[^1J4apL_*wiirf!i=l.+*w =  2V, 

where  ki±1  =  k(xi±l  Te,ui±l),  e>0  and 

i  =  even:  k*  — — —  +  k*  =  kT  — — —  +  k~ 

h  h 

where  k*  =  k(x ± £,m, ),  e > 0. 


The  e  >  0  are  used  to  insure  the  hydraulic  conductivity  is  taken  from  the  appropriate  cell. 


Table  9.  CVM  for  Richards' Equation 


In  the  calculations  recorded  in  Table  9,  n  /  2  are  the  number  of  cells,  and  there  are  n  -  1 
unknowns.  Nonlinear  SOR  was  used  with  the  hydraulic  conductivity  being  updated  as  soon  as 
any  variables  were  computed.  The  function  and  flux  =  k  ux  +  k  errors  were  the  max  norm  at  the 
center  of  the  cells  and  cell  boundaries,  respectively.  Inspection,  of  the  table  entries  indicate  the 
CVM  has  quadratic  convergence  in  both  the  function  and  flux  errors. 

The  above  example  illustrates  a  fluid  flow  in  a  porous  medium  which  has  two  layers  where 
the  pressure  does  not  vary  over  a  large  range.  Hence  the  hydraulic  conductivity  is  a  piecewise 
linear  function  of  position  and  pressure.  Extension  to  the  time  dependent  2D  problem  where  the 
empirical  functions  depend  only  (hi  the  pressure  are  given  in  [1 1]  and  its  references.  These  results 
combine  to  suggest  that  the  general  problem  can  be  solved  in  the  present  context 

§6.  Conclusions.  We  have  shown  for  a  number  of  heterogeneous  diffusion 
problems  that  the  CVM  discretization  method  gives  higher  order  error  estimates  for  both  the 
function  and  flux  errors.  Moreover,  we  have  illustrated  that  both  the  ADI  and  SOR  algorithms  can 
be  adapted  to  solve  the  resulting  nonlinear  algebraic  systems.  Vectorization  and  multiprocessing 
computers  can  be  use  to  effectively  execute  these  algorithms. 

In  the  applications  to  the  heat  conduction  in  a  laminate  material,  the  Stefan  problem,  and 
Richards'  equation  we  have  indicated  how  one  can  extend  the  CVM  to  2D  and  3D  space  problems. 
All  these  problems  are  of  current  interest  to  physical  scientists  and  engineers.  Moreover,  the 
mathematical  analysis  of  convergence  for  the  more  complicated  versions  of  CVM  is  certainly 
needed. 
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ABSTRACT 

This  paper  describes  an  SOR  algorithm  for  solving  the  nonlinear  algebraic  system  which 
evolves  from  Richards'  equation  that  models  fluid  flow  in  a  porous  media.  The  moisture 
content  and  hydraulic  conductivity  functions  are  approximated  by  piecewise  linear 
functions  obtained  from  field  data.  The  resulting  algebraic  system  is  solved  by  a  variation 
of  the  nonlinear  SOR  algorithm.  The  advantage  of  this  approach  is  that  it  avoids  some  of 
the  numerical  oscillations  associated  with  large  derivatives  in  the  data.  Numerical 
calculations  are  presented  and  illustrate  the  following:  (i)  agreement  of  the  numerical 
model  with  observed  data,  (ii)  dependence  and  comparison  results  as  a  function  of 
uncertain  data,  and  (iii)  suitability  of  these  algorithms  for  multiprocessing  computations 
via  domain  decomposition  methods.  Extension  of  these  algorithms  to  heterogeneous 
porous  media  fluid  flow  are  discussed. 

'The  calculations  were  done  at  the  North  Carolina  Supercomputing  Center. 

Supported  by  CNPq  and  Promon  Engenharia  from  Brazil. 
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1.  INTRODUCTION 

The  study  of  fluid  flow  in  porous  media  has  several  important  applications  in 
engineering  (Kaviany  1991  and  Nield  and  Bejan  1992)  Specific  examples  are:  filters  for 
industrial  use,  or  separators  in  aerospace  fuels  (Kaviany  1991);  use  of  geothermal  energy 
(Rae  et  al.  1983  and  Kimura  1989,  1989a);  oil  recovery  (Bear  1972);  groundwater 
(Mark  1992,  1993  and  Clothier  et  al.  1981)  and  agriculture  (Feng  1993). 

Industrial  chemical  or  radioactive  effluents  are  sometimes  deposited  at  the  surface 
or  in  drums  that  are  buried  underground.  In  both  normal  operations  and  in  accidental 
conditions,  it  is  required  to  give  an  analysis  of  the  transport  of  the  contaminants  through 
the  soil  (Muralidhar  1990,  1993).  The  first  step  in  this  analysis  is  the  mathematical 
simulation  of  fluid  flow  through  the  soil. 

Richards  (1931)  developed  an  equation  that  is  a  combination  of  the  continuity 
equation  and  Darcy's  law  (Philip  1969),  and  it  models  fluid  flow  in  a  porous  medium.  It  is 
a  nonlinear  parabolic  partial  differential  equation  which  contains  the  empirical  functions 
for  moisture  content  0(h)  and  hydraulic  conductivity  K(h). 

0(h),  -  V  K(h)Vh  -  K(h)t  =0  in  Qx(0,T)  (la) 

where  h  is  the  hydraulic  pressure  head,  z  is  the  vertical  direction  and  £2  is  the  space 
domain.  The  boundary  condition  of  the  third  kind  has  the  form 

\  K  (h)V  (h  +  z)\-n  =  given  inSx(0,T)  (lb) 

where  n  is  the  unit  outward  normal  to  £2  and  S  is  the  boundary  of  £2  The  initial  condition 
is 

h  =  given  fort  =  0  in  £2.  (1c) 
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In  general  the  equations  ( la-  lc)  are  coupled  with  a  parabolic  system  of  equations  for  the 
transport  of  a  number  of  contaminants  through  the  soil  (Freeze  and  Cherry  1979  and  Feng 
1993).  This  is  done  by  using  the  fluid  velocity  v  =  K(h)V(h  +  z)  which  is  computed  from 
the  above  system 

In  practice  the  empirical  functions  for  moisture  content  and  hydraulic  conductivity 
have  several  troublesome  properties  First,  they  can  have  large  derivatives,  and  this  is 
often  the  case  for  hydraulic  conductivity.  Second,  they  are  not  precisely  known.  Third, 
they  can  have  strong  space  dependence  with  jump  discontinuities  resulting  from 
heterogeneous  porous  media.  The  objective  of  this  paper  is  to  give  an  approach  to  these 
problems  which  is  based  on  methods  used  for  the  Stefan  problem  (Silva  Neto  and  White 
1993).  Particular  attention  will  be  given  to  the  first  problem  where  there  is  no  space 
dependence.  In  the  case  of  space  dependent  empirical  functions,  one  can  use  additional 
nodes  and  the  continuity  condition  on  the  fluid  velocity  (White  et  al.  1993)  to  generalize 
the  methods  of  this  paper 

Traditional  methods  for  the  solution  of  (la-lc)  use  an  approximation  of  the 
empirical  data  by  exponential  functions  (van  Genuchten  and  Nielsen  1985).  Then 
numerical  methods  such  as  Newton,  Picard,  or  Lees  implicit  factored  method  can  be  used 
for  problems  without  large  derivatives  (Paniconi  et  al.  1991).  In  addition  to  addressing 
the  above  problems,  the  approach  of  this  paper  does  not  involve  expensive  function 
evaluations  and  does  eliminate  the  numerical  oscillations  associated  with  large  derivatives 
of  the  empirical  functions. 

In  section  two  we  present  the  general  approach  to  the  problem  which  is  adapted 
from  the  Stefan  problem.  Here  the  empirical  functions  are  approximated  by  piecewise 
linear  functions  which  reflect  the  field  data  (White  and  Broadbridge  1988).  The  partial 
differential  equation  is  discretized  by  the  finite  difference  method,  and  the  resulting 
nonlinear  system  is  solved  by  a  nonlinear  SOR  algorithm  (Cryer  1971)  which  is  described 
in  section  three. 
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Numerical  experiments  are  presented  in  sections  four,  five  and  six,  and  these 
experiments  were  chosen  to  demonstrate  the  feasibility  of  realistic  two  dimensional 
simulation  of  porous  flows.  Later,  we  indicate  how  one  can  extend  these  methods  to  three 
dimensional  and  heterogeneous  porous  flows.  We  show  agreement  of  the  numerical 
model  with  the  field  data  from  Brindabella  silty  loam  soil.  Also,  we  show  how  one  can 
develop  comparison  results  which  deal  with  the  uncertain  empirical  data.  High 
performance  computing  issues  are  described.  Here  we  demonstrate  that  the  algorithm  in 
section  three  does  not  vectorize  well,  but  it  does  work  well  for  multiprocessors  when 
domain  decomposition  methods  are  used.  Finally,  we  state  our  conclusions  and  related 
work. 


2.  DISCRETE  VERSION  OF  RICHARDS'  EQUATION 

In  this  section  we  state  the  finite  difference  discretization  of  Richards'  equation  and 
make  some  comparisons  with  the  Stefan  problem.  If  in  equation  (la)  the  last  term  is 
eliminated,  and  h  were  to  represent  temperature  with  K  now  denoting  the  thermal 
conductivity  and  6  the  enthalpy,  then  this  would  be  the  enthalpy  formulation  of  the  Stefan 
problem  (White  1985).  In  the  Stefan  problem  the  K  and  0  have  jump  discontinuities  at  the 
phase  change  temperature.  SOR  methods  can  be  effectively  used  provided  the 
overrelaxation  is  not  applied  during  a  cell's  phase  change. 

In  Richards'  equation  we  will  approximate  K  and  6  by  piecewise  linear  functions 
which  could  be  viewed  as  a  number  of  "linear  phases"  associated  with  the  nonlinear  flow. 
As  in  the  Stefan  problem  we  will  apply  the  SOR  method  provided  the  cell  is  not  changing 
"linear  phase."  Table  1  gives  some  data  for  Brindabella  loam  which  was  extrapolated 
from  the  graphs  in  White  and  Broadbridge  (1988).  Note,  both  0(h)  and  K(h)  are 
monotone,  and  K(h)  has  large  derivatives. 


Table  1: 


Brindabella  Data 


■ 

■m 

KEm^m 

-8.0 

0.11 

0.0 

-1.2 

0.27 

0.000  118 

-0.8 

0.30 

0.000  327 

-0.7 

0.31 

0.000  457 

-0.6 

0.32 

0.000  664 

-0.5 

0.33 

0.001  138 

-0.4 

0.34 

0.001  693 

-0.3 

0.36 

0.002  326 

-0.2 

0.38 

0.004  568 

-0.1 

0.42 

0.011  117 

-0.0 

0.485 

0.118000 

In  Silva  Neto  and  White  (1993)  two  "linear  phases"  were  used  and  were  coupled 
with  the  Kirchhoff  change  of  dependent  variable.  Although  this  was  a  crude 
approximation  of  the  data,  it  did  track  the  wet-dry  interface  where  a  rapid  transition  from 
unsaturated  to  saturated  regions  occurs.  In  cases  where  either  the  transition  is  not  rapid 
or  there  is  space  dependence  of  the  data,  the  Kirchhoff  transformation  is  not  applicable. 
In  the  following  we  use  an  implicit  time  discretization  of  Richards'  equation. 

<Kh)~  M)  -  (K(h)h, ),  -  (K(h)h  ),  -  Kih)  =  0  (2) 

A  t 

where  h  is  known  from  the  previous  time  step.  Here  we  are  in  two  space  dimensions,  and 
y  is  the  vertical  direction. 

Next  we  discretize  the  space  variable  by  the  finite  difference  method,  in  the 
calculations  that  we  later  discuss,  we  consider  a  two  dimension  flow  with  zero  flow 
through  the  sides  and  bottom,  and  nonzero  flow  through  the  top.  The  finite  difference 
grid  is  illustrated  in  Figure  1  where  the  nine  types  of  boundary  cells  are  indicated.  Here 
there  are  N  =  3  cells  in  each  direction  and  =  9  unknowns. 


(K(h)h,-K(h))  =R 


-(K(h)hy+  K(h))=0 

Figure  1:  Finite  Difference  Grid 


Let  (ij)  denote  the  location  in  the  finite  difference  grid.  Then  the  general  form  of 
the  finite  difference  equation  at  this  location  is 

du  ~cuhu  =  r<M  lzi<niandl<j  zn7.  (3) 
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In  the  above  equation  we  used  the  convention  that  A, is  the  average  of  the 

hydraulic  conductivity  at  the  appropriate  surrounding  nodes  We  also  will  assume  that  the 
surrounding  nodes  are  evaluated  at  a  "previous  iteration"  value  Thus  equation  (3)  is  a 
piecewise  linear  system  as  illustrated  in  Figure  2.  Both  the  piecewise  linear  approximation 
of  the  moisture  content  and  hydraulic  conductivity  functions  are  monotone,  and  so,  T 
must  also  be  monotone  nondecreasing.  Since  the  term  on  the  left  side  of  equation  (3)  has 
negative  slope,  equation  (3)  has  a  solution,  and  it  is  unique.  In  Figure  2  the  data  is 
depicted  as  being  continuous,  but  this  need  not  be  the  case.  Even  if  T(h)  has  a  jump  and 
remains  nondecreasing,  one  can  still  solve  for  a  unique  h  (Silva  Neto  and  White  1993). 


3.  NONUNEAR  SOR  ALGORITHM 

The  following  algorithm  has  evolved  from  the  work  by  Cryer  (1971)  for  set  valued 
systems  of  equations  that  may  come  from  models  of  the  Stefan  problem.  We  apply  a 
variation  to  the  system  given  in  (3).  The  following  variables  are  used: 

maxit  =  number  of  allowable  SOR  iterations  per  time  step, 
nz,ny=  number  of  cells  in  the  x  and  y  direction. 

To,  &  =  overrelaxation  (larger  than  1 .0)  and  underrelaxation  (less  than  1 .0) . 


X 


Nonlinear  SOR  Algorithm 
for  k  =  l,maxit 

for  i  =  1,11, 

for  j  =  1  ,ny 

compute  ct  J  and  d;j  as  given  in  (3) 

solve  for  h  in  (3)  as  given  in  Figure  2 
if  h  and  h{ }  are  in  the  same  linear  phase,  then 

hu  =(l-5)hy  +7oh 
else 

h4J  =  (l-#,j  +  ah 

endif 
end  loop j 
end  loop  i 

test  for  convergence 
end  loop  k. 

In  the  computation  of  the  c  and  d  values  one  must  consider  the  nine  types  of  cells 
as  indicated  in  Figure  1.  For  more  complicated  geometric  configurations  and  for 
heterogeneous  porous  media,  this  will  be  more  complicated.  If  adjacent  cells  have 
different  moisture  content  and  hydraulic  conductivity  data,  then  one  must  insert  additional 
nodes  between  the  cells  and  demand  continuity  of  the  flow  velocity  at  the  interface  (White 
et  al.  1993). 

In  the  solve  step  one  must  determine  which  linear  phase  the  solution  is  in,  and  this 
is  done  by  partitioning  the  vertical  axis  as  indicated  in  Figure  2  by  the  dotted  lines  that  are 
parallel  to  the  line  given  by  d-ch.  Hence,  the  solve  step  has  a  loop  in  it  which  was  not 
indicated  above.  This  hidden  loop  contains  the  nonlinear  nature  of  the  solve  step. 
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Moreover,  if  there  is  a  large  number  of  linear  phases,  then  the  solve  step  will  become  more 
expensive  to  compute 

The  overrelaxation  is  used  to  reduce  the  number  of  outer  iterations,  and  the 
optimal  choice  will  vary  with  the  number  of  unknowns  The  underrelaxation  is  used  to 
avoid  numerical  oscillations,  and  this  works  well  for  choices  between  0.8  and  0.9  The 
numerical  oscillations  are  a  result  of  passing  from  one  linear  phase  to  the  next  linear  phase 
This  deals  with  the  large  derivatives  of  the  data  by  breaking  the  changes  in  slopes  into  a 
number  of  smaller  changes  in  slope.  We  found  this  to  be  much  more  effective  than  the 
traditional  method  of  reducing  the  time  step. 

We  experimented  with  a  number  of  convergence  tests.  Finally,  we  imposed  two 
conditions: 

(i)  max|  new  h  -  old  h  |  ^  et  and 

(ii)  JJ  |  new  9  -  old  9\  <  e2. 

The  first  condition  is  aimed  at  possible  convergence  of  the  pressure  at  each  node.  The 
second  condition  reflects  possible  convergence  of  the  total  moisture,  and  it  is  more  of  a 
global  test  than  the  first  condition. 

4.  COMPUTATIONS  FOR  BRINDABELLA  LOAM 

The  purpose  of  these  computations  is  to  see  if  our  model  of  Richards'  equation  will 
accurately  track  the  movement  of  moisture  through  Brindabella  loam.  We  compare  our 
calculations  with  the  observations  in  White  and  Broadbridge  (1988).  In  our  numerical 
model  we  considered  a  0.3[m]  x  0.3[m]  region  with  boundary  conditions  as  indicated  in 
Figure  1 .  In  the  top  boundary  we  used  R  =  0.0165[m/hr],  and  the  initial  pressure  was  set 
as  h  =  -8.0[m].  The  moisture  content  function  was  a  linear  interpolation  of  the  data  in 
Table  1.  The  hydraulic  conductivity  data  indicates  a  very  increasing  and  concave  up 
function;  consequently,  linear  interpolation  of  the  data  would  generate  large  errors.  In  the 


calculations  presented  in  Figure  3a  we  used  a  linear  interpolation  of  the  modified  data  for 
hydraulic  conductivity  in  Table  1;  we  reduced  the  interior  values  by  50  percent  and  kept 
the  two  end  values  at  0.0  and  0. 1 18 

In  our  computations  we  set  the  following  parameters  at 
At  =  0. 125[hr]  Ax  =  0.015[m] 

N  =20[cellsineachdir.]  R  =  0.0165[m/hr] 

g>  =  0.9  td  =1.4 

ej  =  10-4  £2  =  10"®. 

Convergence  was  usually  attained  in  20-40  iterations.  If  no  underrelaxation  was  used, 
then  numerical  oscillations  would  occur  about  once  every  20  time  steps. 

During  the  initial  times,  the  hydraulic  conductivity  is  small,  and  Richards'  equation 
is  dominated  by  wave  like  properties.  As  time  progresses  the  hydraulic  conductivity 
increases  so  that  Richards'  equation  is  dominated  by  diffusion  At  time  6.0[hr]  the  steady 
state  solution  has  essentially  been  reached.  At  the  bottom  (y  =  300[mm])  the  porous 
media  is  at  saturation  (6  =0.485).  At  the  top  (y=0[mm]>  the  porous  media  has  pressure 
such  that  K(h)  =  R  (6(h)  =0.426).  At  this  time  the  diffusion  force  is  equal  to  the 
gravitational  force;  hence,  no  more  moisture  can  enter  the  porous  media. 


II 
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Figure  3a:  Computed  Moisture  for  Variable  Times 

The  observed  moistures  are  indicated  by  discrete  points  in  Figure  3b.  These  were 
for  the  times  of  1.0  (diamonds),  2.25  (triangles)  and  5.0  hours  (squares).  The  computed 
values  give  good  agreement  with  the  observed  values.  The  computed  moisture  content 
curves  are  somewhat  more  smoothed  than  the  observed  moisture  content  data.  This  may 
be  attributed  to  large  hydraulic  conductivity  data;  if  one  reduces  the  hydraulic 
conductivity  for  smaller  pressures,  then  a  sharp  front  can  be  calculated  to  match  the 
observed  moisture  content  data. 
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Figure  3b:  Observed  Moisture  Data  for  Variable  Times 


5.  COMPUTATIONS  WITH  UNCERTAIN  DATA 

This  section  contains  an  analysis  of  the  moisture  as  a  function  of  the  empirical 
data  of  the  moisture  content  and  hydraulic  conductivity  functions.  In  practice  much  of  this 
data  is  not  precisely  known,  and  therefore,  the  effects  upon  the  computations  from  any 
model  will  have  some  uncertain  aspects.  In  our  numerical  experiments  we  decreased  K 
and  the  computed  moisture  at  the  top  increased.  We  also  increased  0  and  the  computed 
moisture  at  the  top  increased  In  the  computations  indicated  in  Figure  4,  for  time  equal  to 
1  25[hr],  we  decreased  K  and  increased  6,  and  the  largest  computed  moisture  content  at 
the  top  was  the  result.  Here  we  kept  the  data  at  the  end  points  of  Table  1  fixed  and  varied 
the  interior  data  by  increments  of  20  percent.  In  all  computations  the  computed  moisture 
content  at  the  top  increased  while  the  computed  moisture  content  at  the  bottom 
decreased.  This  happens  because  the  sides  and  bottom  do  not  permit  flow  through  them, 
and  the  total  moisture  must  remain  constant. 
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Figure  4:  Moisture  end  Variable  Data 

In  order  to  gain  some  insight  on  this,  h  is  instructive  to  examine  the  finite 
difference  equation  at  the  top  region.  In  the  case  of  the  top  center  nodes,  d  has  the  form 
2r  K('J  _  )  _  - 

d  = - +  d  where  d  has  form  similar  to  that  given  in  (3). 

a y  Ay 

Figure  5  shows  that  if  T  increases,  then  the  solution  of  (3)  will  decrease.  Also,  if  d 
decreases,  then  the  solution  of  (3)  will  decrease.  Therefore,  if  both  T  increases  and  d 
decreases,  then  the  solution  of  (3)  will  decrease.  If  K  decreases,  then  d  will  increase  and 
T  will  decrease.  However,  if  both  K  decreases,  and  0  increases  enough,  then  T  will 
increase.  For  our  choices  of  At  and  Ay  this  is  the  case.  Of  course,  this  is  just  an  analysis 
at  one  grid  point,  and  the  argument  requires  much  more  careful  discussion. 


- 1  4K,  6T 

1.2K,  8T 
1.0K,  1.0T 

-  8K,  1.2T 

-  ,6K,  1.4T 
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6.  COMPUTATIONS  USING  MULTIPROCESSORS 

In  the  computations  reported  in  this  section  we  tried  to  implement  the  above 
algorithm  on  a  single  CPU  with  vectorization  on  a  Cray  Y-MP,  the  Alliant  FX-40  with 
two  vectorized  CPUs,  and  the  Kendall  Square  Research  KSR1  with  up  to  16  CPUs  and  no 
vectorization.  In  the  calculations  in  Silva  Neto  and  White  (1993)  the  vectorization 
methods  did  not  seem  to  work  well.  These  attempts  involved  reordering  the  nodes  by  the 
red-black  order  (checker  board  order).  This  method  also  did  not  work  well  for  our 
current  problem.  The  reason  for  this  is  that  the  inner  most  loop  has  computations  which 
are  too  complicated  to  effectively  be  done  on  a  vector  pipeline. 

The  multiprocessing  approach  with  domain  decomposition  reordering  (White 
1987.  or  Ortega  1988)  was  much  more  promising.  This  reordering  is  depicted  in  Figure  5 
where  L  =  4  (the  number  of  larger  blocks  of  nodes)  and  the  classical  order  of  the  blocks  of 
grid  points  is 

P  P  P  P  P  P  P 

The  domain  decomposition  order  lists  all  the  smaller  interior  boundary  blocks  first  (even 
number  blocks  in  Figure  6)  and  is 
P  P  P  P  P  P  P 
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The  idea  behind  this  reordering  is  to  take  advantage  of  the  5-point  finite  difference  pattern 
Once  the  calculations  in  the  even  blocks  have  been  done,  then  the  calculations  in  the  large 
odd  blocks  are  independent  of  one  another 


Figure  6:  Domain  Decomposition  Order 


for  k  =  l,maxit 

concurrently  do  SOR  over  the  even  blocks 
update 

concurrently  do  SOR  over  the  odd  blocks 
test  for  convergence 
end  loop  k. 


In  our  calculations  we  used  the  Kendall  Square  Research  multiprocessing 
computer,  KSR1,  which  is  operated  by  the  North  Carolina  Supercomputing  Center.  The 
KSR1  multiprocessing  computer  has  three  parallel  constructs  that  can  be  used  in 
FORTRAN  code:  tile,  parallel  section  and  parallel  region.  Tile  is  used  to  partition  loops 
and  is  very  effective  for  simple  computations  such  as  matrix  multiplications.  Parallel 
section  can  be  used  to  concurrently  execute  different  code  segments.  We  used  parallel 
region  which  duplicates  a  code  segment  and  uses  different  data  streams.  In  our 
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computations  we  controlled  the  number  of  processors  by  using  a  team  of  processors  that 
are  assigned  at  the  beginning  of  the  code  and  are  used  to  reduce  parallel  overhead 

Table  2  shows  the  speedup  and  efficiency  for  a  variety  of  L  (the  number  of  large 
blocks)  and  N  (the  number  of  cells  in  each  direction)  These  quantities  are  defined  as 
follows: 

Sl  =  (CPU  time  using  one  block)/(CPU  time  using  L  large  blocks)  and 

El  =  sl/l 

In  the  first  four  rows  N  varies  and  L  is  fixed.  We  see  increased  speedup  and  efficiency  as 
N  increases.  This  is  a  result  of  decreased  parallel  overhead.  In  the  last  four  rows  N  is 
fixed,  and  L  is  increased.  Here  the  speedup  increases,  but  the  efficiency  decreases.  Of 
course,  if  there  are  many  larger  blocks  (L)  and  the  number  of  cells  in  each  direction  (N) 
remains  the  same,  then  the  relative  size  of  the  larger  block  to  the  smaller  block  decreases. 
This  partially  accounts  for  the  decreased  efficiency.  These  calculations  did  not  attempt  to 
make  the  most  efficient  use  of  the  FORTRAN  language,  or  the  most  efficient  use  of  the 
KSR1  computer's  architecture. 

Table  2:  Speedup  and  Efficiency 


N 

L 

Sl. 

Er 

20 

2 

1.56 

0.78 

40 

2 

1.68 

0.84 

80 

2 

1.75 

0.88 

160 

2 

1.78 

0.89 

160 

4 

3.16 

0.79 

160 

8 

5.09 

0.64 

160 

16 

7.29 

0.46 
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7.  CONCLUSIONS 

Richards'  equation  was  approximated  by  the  finite  difference  method,  and  the 
empirical  data  for  the  moisture  content  and  the  hydraulic  conductivity  were  approximated 
by  piecewise  linear  functions.  The  resulting  nonlinear  algebraic  system  was  solved  by  a 
variation  of  the  nonlinear  SOR  iterative  method.  Good  convergence  properties  were 
observed  for  three  types  of  calculations  which  were  chosen  to  demonstrate  the  feasibility 
of  realistic  numerical  simulations  using  this  method.  These  included  an  accurate 
simulation  of  fluid  flow  in  Brindabella  loam,  a  sensitivity  analysis  of  the  computed  solution 
upon  the  empirical  data,  and  the  use  of  multiprocessing  computers  via  domain 
decomposition  methods. 

In  the  above  calculations  the  empirical  data  did  not  have  a  space  dependence. 
However,  in  White  et  al.  (1993)  we  illustrated  for  a  steady  state  and  one  space  dimension 
version  of  Richards'  equation  that  the  compact  volume  method,  in  place  of  the  finite 
difference  method,  could  be  effectively  used  for  such  heterogeneous  problems.  The 
compact  volume  method  can  be  viewed  as  an  enhanced  finite  difference  method  where 
additional  nodes  are  inserted  at  the  cell  interface  and  additional  equations  are  generated  by 
requiring  continuity  of  the  fluid  velocity  at  these  interfeces.  This  may  be  done  for  all  cell 
interfaces  or  for  just  those  cells  where  the  empirical  functions  change  with  respect  to  the 
space  variable.  We  expect  the  methods  of  this  paper  to  generalize  via  the  compact  volume 
method  to  the  more  complicated  heterogeneous  case. 
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Abstract 


We  consider  both  the  compactfinite  volume  and  finite  difference  space  discretizations  of  the  Stefan  problem. 

The  resulting  algebraic  systems  are  solved  by  nonlinear  versions  of  ADI  and  SOR.  Both  algorithms  contain 
significant  parallelism  which  is  demonstrated  on  two  vector! multiprocessing  computers,  the  Alliant  FXI40  and 
the  Cray  Y-MP.  Numerical  experiments  indicate  that  the  compact  discretization  and  ADI  give  the  best  accuracy 
with  the  minimum  computational  cost. 


Introduction 

■  This  report  outlines  some  new  numerical  methods  for  the 
solution  of  multiple  space  dimension  Stefan  problem.  This 
will  include  both  SOR  and  ADI  algorithms  Tor  the  nonlinear 

I  algebraic  systems  which  result  from  both  the  FDM  and 
compact  space  discretizations  melhos.  Analysis  of 
convergence  will  be  discuscd. 

I  The  ADI  algorithms  have  been  introduced  because  they 
appear,  in  many  examples,  to  be  more  effective  than  some  of 
the  traditional  algorithms  related  to  the  Stefan  problem. 

I  Moreover,  these  ADI  algorithms,  as  in  the  linear  parabolic 
problems,  have  large  number  of  independent  tridiagonal 
solvers.  Therefore  they  can  be  effectively  implemented  on 
vcctor/multiprocessing  computers.  This  is  illustrated  for 
■  Alliant  FX/40  and  the  Cray  Y-MP.  . 

™  In  this  paper,  wc  will  use  the  enthalpy  formulation  of  the 
Stefan  problem.  (See  Rose)3,  Elliot2  and  White5). 

|  I.  E,-AP(£)  =  /on  Dx(0,T) 

E(x,0)  =  given  on  D 

|  p(E(r,t))  =  given  on  dDx(0,7) 

where 

IP(£)=Kirchoff  transformed  temperature 

p(E)  -  u  is  the  "inverse"  of  the  enthalpy  function 
E  =  H(u). 

■  Typical  values  are  (See  Williams  and  Wilson7) 
u=270 

i 

I  a  =  l.OE+2 


a^  *  l.OE+3  (Degrees  in  Kelvin) 


|  ‘This  research  was  sponsored  by  the  Air  Force  Office  of  Scientific 
Research,  Bolling  Ah  Force  Base,  D  C  Contract  No.  F49620-89-C- 
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tract  No.  DAA103  90  G-0 126. 
j  I.  NC  State  University,  Raleigh,  NC 
INC  At  T  State  University,  Greensboro,  NC 


Stefan  Number  -  1  =  (u  -  u^/L 

For  llicse  typical  values  and  the  range  of  u,  it  is  important  to 
note  that 

P  (£)/£  is  Lipschilz  continuous  and 

0£  mS  (P (£))/££  M  where 

m  =  p  (27000))/27000  -  270/27000=  1.0E-2 

M  =  P  (27030))/27030  -  285/27030=  I.054E-2 

The  difference  M-m  =  5.4E-4  is  relatively  small  and  will  be 
important  in  a  mild  constraint  on  the  time  interval  =  A  /. 

Problem  (I)  may  be  discretized  in  the  lime  variable  either 
explicitly  (see  Athcy1)  or  implicitly  (see  White5).  In  the 
former,  (his  is  a  stability  condition  on  A  /,  but  there  is  not  a 
nonlinear  system  to  solve.  The  implicit  time  discretization 
yields  no  stability  constraint  on  A  T,  but  it  does  give  a 
nonlinear  system  to  solve.  In  White5  and  Elliot2  nonlinear 
SOR  methods  have  been  used  to  approximate  this  system 
which  has  the  form 

2.  E+A/AP(E)  =  d 

Where  A  is  an  M-mairix  associated  with  -A. 

Later  in  White6,  a  modified  version  of  (2)  was  studied  so  that 
more  traditional  linear  solvers  could  be  used  in  the  solution 
of  (2).  The  component  from  of  (2)  is 

n 

£.+  At  X  0;.,  P(E,)=4 


£,+  At 


Let  A(E) 


"  (  P  (£.)) 

°i,i  -£?-  Ei  -  di 
j-  •  1  ) 

f  P  («/)' 

=  o;  ;  — and  consider  the 

l  '  EJ  J 


and  consider  the  vector  form  of  (3) 


E  +  A  t  A(£)  £  =  d 
/+  A  (A(£))£=  d 
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For  A  i  somewhat  icstriclcd  and  the  above  typical  values, 
IrAM(E)  will  be  strictly  diagonally  dominant,  and 
therefore,  nonsingidai.  1  his  allows  iis  lo  dcliuc  the  following 
algorillini: 

5-  Em+1  =  (/  +  A/  ME”'))"'  d 

The  convcrgctice  of  (5)  was  established  by  showing  for 
suitably  small  At  that  lire  map  was 

G{E)  m  (I  +  Aty^(en))-id 

contractive.  However,  numerical  experiments  on  a  one 
variable  Stefan  problem  showed  tliat  At  constraint  note  to  be 
too  severe. 

This  study  extends  the  solution  of*1*  and  *4*  to  higher 
dimensions  by  considering  AOf  splitting  associated  with 
A(E). 


ADI:  One  II  and  V  sweep  per  time  step 

Let  A  =  H  +  V,  wlterc  H  and  V  arc  associated  with  horizontal 
grid  rows  and  the  vcrucal  grid  columns  of  -  A.  Then  we  define 


//(E)  «//(E)  = 
V(E)  m  V(E)  = 


hs  i 


P  <Ej) 


Ej 

P  (£/) 

i».  .  - 

E; 


/>(E)  =^  +  A///(E) 

0(E)  =  |  +  AtV(E) 

Note  I  +  At  A(E)  =  /)(E)+P(E),  and  for  mild  constraint  on 
At ,  /)(£)  andO'(E)  arc  nonsingular  and  triadiagonal. 

This  suggests  flic  following  nonlinear  verion  of  the  Pcaccman 
-Ranchford  ADI  algorithm  for*1*: 


6.1 


rk+l/2 


-  £*  .  r: 


A//2 


+  77(E*+1/2)£*  +  I/2  =  /  +  ,/2-  V(Ek) 


6.2 


V(E*+,)E*+,  =  /+1-  // (E* +  ,/2) 


At  each  lime  step,  k,  (6.1)  and  (6.2)  gives  a  nonlinear  system 
to  solve: 

7.1  /7(E*+1/2)£*+,/2  =  t/*+,/2-  b (E*) E* 

7.2  P(E*+l)Et+,=  d*41  -  />(E*+i/2)£*+,/2 

It  is  useful  to  incorporate  the  ADI  acceleration  parameter  a 
which  will  also  alow  as  lo  avoid  any  further  constraint  on  At: 

A(E)  =  (a/+  fl(E))~  (a/-  P(£)) 

Then  (7.1)  and  (7.2)  give 


8.1 

(a /  +  />(£* 4  ,/2))E*4,/2  =  ci*4  1/2  i  (a/  -  f\C*4  l/2))E* 
8.2 

(a/+  (E*  4  1 ))  E*  +  1  =  d*+,+  (a/  -  I)  (E**  ,/2))  £* 4  1/2 

Both  (8.1)  and  (8.2)  can  be  solved  ns  in  (5).  For  example, 
consider  (8.1)  with  k  +  1/2  suppressed  and  in  equals  llic 
iteration  index. 

Consider 

9  (a/+  fl(E)d 

/)(£") j  d 

Proposition  1: 

Let  A,  A(E),  fl  (E),  (*  (E)  be  as  above.  If  A  is  an  M~mairix 
and  P(E)  is  as  defined  above,  then  A(E),  fl  (E),  (E)  arc 

non-singular  for  small  At.  Moreover,  there  exists  a„>  0  such 
dial  for  a  £  ao*10*  converges  lo  die  unique  solution  of  ***. 

The  proof  follows  dial  given  in  While*6*  for*4*  and  *5*. 

In  practice  a  is  used  as  an  acceleration  parameter  and  usually 
only  3-5  iterations  arc  required  to  solve  (8.1)  or  (8.2).  The 
schemes  in  (8. 1 )  and  (8.2)  do  not  solve  die  implicit  time  step: 

11.  A(E*+1)E*+,  =  /  +  l 

For  each  fixed  problem  in*11*  there  arc  a  number  of  ADI 
schemes  with  multiple  H  and  V  sweeps.  We  focus  on  only 
one  in  the  next  section. 

ADI:  Multiple  II  and  V  sweeps  per  time  stop 
Consider  die  simple  nonlincrar  system*4*  with 
A(E)  E  *  77  (E)  +  V  (£),  or 

12.  (/)(£)+  P(E))E=  d 

After  linearization  the  multiple  11  and  V  sweep  ADI 
algorithm  is 

13.1 

(«/  +  /)  (Emo)) Em+ 1/2  =  d+  (a/  -  f*(Em»))Em 

13.2 

(a /  +  P(Emo))Em+,=  d+  (a/-  />(£"»)) E”+ 1/2 
One  can  solve  (13.1)  and  insert  it  into  (13.2)  lo  obtain 

14.  E"*4  1  =  6  (E"1®)  +  //  (E*®)  Em 
If  E"*4 1  converges,  define  £'”• +  1 

provided  die  spectral  radius  of  II  (£*" •)  is  less  dtan  I .  This  is 

15.  ETo 4 1  =  (/  -  //  (£'n°)f1  5  (E*1®) 


10  Er+'=  (a/  +  I)  (ET))~ld 


-H 
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me  (15)  forms  the  outer  iteration  of  our  algorithm  i  solve 

(2). 

Proposition  2 

Consider  (12)  and  the  algorithms  (13)  -  (15).  Let  the  same 
assumptions  hold  as  in  proposition  1.  Then  dicrc  exists  a 
n0  >  0  such  dial  if  a  £  cta,  the  inner  iterations  ( 1 4)  converge 

to  E1"®  4  1  in  (15),  and  £"u,+  1  converges  to  the  solution  in 
(12). 

The  proof  requires  the  spectral  radius  ofll(E)  to  be  uniformly 
bounded  below  t.O.  This  step  makes  use  of  monotoniciiy 
properties  (nonnegalive  matrices),  and  it  contrasts  widt  the 
linear  ease  where  symmetric  positive  definite  matrices  arc 
used. 

The  above  schemes  is  not  the  only  way  to  do  multiple  H  and 
V  sweeps,  but  it  docs  allow  one  to  give  a  convergence 
analysis.  The  best  version  of  multiple  H  and  V  sweeps  is  not 
yet  clear.  The  numerical  experiments  that  we  have  done 
indicate  that  tire  Pcaccman-Ranchford  method  (one  H  and  V 
sweep  per  lime  step)  is  adequate. 

Numerical  Experiments 

In  this  section  we  consider  the  numerical  solution  of 
E,  -  A  p  (E)  =  0 

where  (lie  domain  D  =  (0,1)  x  (0,1)  x  (0.1/V2) 
and 

e«- :£<*+»_  ,  t-  4-(x  +  y)2  0 

£= 

-l,  «-■£(*+*)<  0 

The  space  variables  arc  discretized  in  two  ways,  the  compact 
finite  volume  (CPT)  (See  Rose4),  and  the  traditional 
five-point  finite  difference  method  (FDM).  The  resulting 
algebraic  problems  arc  solved  by  ADI  (one  II  and  V  sweep 
per  lime  step)  and  by  SOR.  The  computations  were  done  on 
two  vcctor/mulliproccssing  computers:  the  Allianl  FX/40 
and  the  Cray  Y-MP.  The  Allianl  FX/40  has  two  processors 
each  with  a  vector  pipeline.  The  Cray  Y-MP  was  used  as  a 
single  processor  with  a  more  sofisticatcd  vector  pipeline  and 
muh  shorter  clock  cycle  time  (4  nsec  as  compared  to  170 
nsec). 

The  CPT  space  discretization  has  a  great  deal  of  parallelism 
which  can  be  executed  on  either  vector  pipelines  or 
multiprocessing  architectures.  The  unknowns  at  the  center  of 
the  cells  maybe  grouped  together,  and  the  unknowns  at  the 
boundary  of  the  cells  grouped  together.  The  resulting 
coefficient  matrix  has  the  form  of  a  two  coloring  scheme. 
Hence  one  can  execute  the  SOR  scheme  for  the  CPT 
discretization  on  vector  pipelines.  Also  the  CPT  space 
discretization  when  solved  by  ADI  methods  will  have  a  large 
number  of  independent  lridiagonal  solvers.  Thus,  the  vector 
version  of  die  lridiagonal  algorithm  may  be  used. 


In  the  ease  of  FDM  space  discretization,  the  traditional  red- 
black  ordering  of  nodes  allows  veclorizaiion  of  die  SOR 
algorithm.  Also,  domain-decomposition  mcdiods  have  been 
used  on  SOR  so  dial  the  two  processors  of  the  Allianl  FX/40 
can  be  effectively  used.  Of  course,  we  can  use  the  vector 
version  of  the  lridiagonal  algorithm  to  solve  the  FDM  when 
ADI  is  used. 

Table  1  CPT- ADI 


Cells 

16^ 

322 

642 

Computers 

Allianl,  serial 

2.01 

15.64 

122.28 

Allianl,  vector 

1.02 

5.98 

38.63 

Alliant,  vector,  2  processors 

0.57 

3.32 

21.07 

Cray,  serial 

0.129 

0.89 

6.52 

Cray,  vector 

0.062 

0.32 

1.76 

Table  2  CPT-SOR 

Cells 

162 

322 

642 

Computers 

Alliant,  serial 

4.68 

74.44 

1 155.00 

Alliant,  vector 

2.67 

30.72 

508.81 

Allianl,  vector,  2  processors 

1.41 

15.99 

270.91 

Cray,  serial 

0.41 

6.23 

96.24 

Cray,  vector 

0.11 

1.06 

11.34 

Table  1  contains  computing  time  done  for  die  compact 
discretization  and  using  one  sweep  ADI  algorithm.  Both 
computers  were  effective  vcctorizcrs.  Table  2  contains 
computing  limes  done  for  die  compact  discretization  with 
red-black  ordering  SOR  algorithm  where  ox=1.2 


Table  3  FDM-SOR 


Cells 

162 

322 

642 

Computers 

Alliant,  serial 

0.82 

12.93 

186.72 

Allianl,  vector 

0.89 

10.29 

124.97 

Alliant,  vector,  2  processors 

0.51 

5.58 

67.35 

Cray,  serial 

0.069 

1.05 

.  15.48 

Cray,  vector 

0.043 

0.42 

4.19 
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The  third  table  contains  computing  times  for  the  FDM  with 
red-  black  ordering  for  SO R  where  o>=1.2.  This  method 
requires  more  iterations  to  reach  convergence  than  docs  die 
CfT-  SOR  method. 

All  three  methods  give  good  accuracy.  The  CPT-AD1  seems 
to  be  more  accurate  and  less  computing  time  required  for  this 
and  related  examples. 
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Typical  values  are  (See  Williams  and  Wilson  [7]) 
uf  *  270 
a,  -  1.0E+2 

«2  »  l.OE+3  (Degrees  in  Kelvin) 

L-15 

Stefan  Number  -  1  =  (u  -  uf)/L 

For  these  typical  values  and  the  range  of  u,  it  is  important  to  note  that 
(p(E))/E  is  Upschitz  continuous  and 
OSmS  (P(E))/E  S  M  where 
m  -  (P(27000)V27000  -  270/27000  -  i.OE-2 
M  -  (P(27030))/27030  -  285/27030  *  1.054E-2 
The  difference  M  -  m  ■  5.4E-4  is  relatively  small  and  will  be  important  in  a  mild  constraint 
on  the  time  interval  *  At. 
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Abstract 

Recently  developed  Compact  Finite  Difference  scheme 
(CPT)  is  applied  to  two  dimensional  diffusion  equations. 
The  relative  merits  of  CPT-ADI  are  investigated  with  other 
computational  schemes  such  as  finite  difference  method  - 
ADI  (F DM -ADI)  and  FDM-SOR.  The  numerical  results  ob¬ 
tained  from  these  three  approaches  are  compared  to  known 
analytical  solutions.  The  primary  interest  of  this  study  lies 
on  vectorization  and  parallel  processing.  According  to  our 
results  shown  in  tables  1  CPT-ADI  is  found  to  be  superior 
scheme  with  regards  to  accuracy,  than  both  FDM-ADI  and 
FDM-SOR.  It  is  also  fastest  algorithm  than  both  FDM-ADI 
and  FDM-SOR  as  it  is  evident  from  CPU  times. 


1.  Introduction 

We  shall  describe  briefly  the  compact  finite  difference 
method  (CPT)  for  one  dimmensional  steady  state  problem. 
The  extension  to  2-D  problem  may  be  easily  done.  The 
underlying  physics  behind  this  approach  lies  in  solving  the 
differential  equation  in  isolation  from  its  neighboring 
subintervals  (i.e.  compactly).  Then,  extend  the  solution  in 
the  large  by  means  of  continuity  conditions  for  the  flux  and 
temperature  accross  the  boundaries  of  the  contiguous 
subintervals  (See  Rose  [1]). 

We  consider  here  one  dimensional  steady  diffusion 


problem: 

(1) 

D.E. 

u'  =  f 

(2) 

f'  =  u 

(3) 

B.C. 

•"•t 

II 

oo 

for  X  e  I  and  I  * 

IL.H 

Divide  the  interval  I  into  m  non overl aping  subintervals: 

*  This  wndi  ni  sponsored  by  the  Air  Force  Office  of 
Scielifk  Research,  Bolling  Air  Force  Base,  D.  C.  Contract  No. 
F49620-W-C-0010  and  Army  Research  Office,  Research  Triangle 
Auk.  N.  C.  Contract  No.  DAAL03-90-G-0126. 


with  center  points  xj,  j=  1/23/2,...,  M  -  1/2  and  interior 
endpoints  xj,  j=l,2,...,  M  -1.  We  shall  adopt  the  finite 

difference  notations  Axi  =  xi  + 1/2 '  xi  .1/2*  *4  *  Ax/2  and 
u(i)  *Uj  We  also  denote: 

Ip  a  { 1/2, 3/2,...,  M  -  1/2}  (center points) 

Ie  =  { 1,2,...,  M  - 1 }  (interior  endpoints) 


1 - 

“• - H - - - 

- 1 

^  fj«, 

Fig.  1. 

The  discrete  equations  for  (1)  and  (2)  can  be  written  as,  for 
jelc. 

(4) 

uj+JL"Uj.i  =  Axj  fj 

2  2 

(5a) 

=  hj  Uj+i 

and 

(5b) 

2  *  2 

Next,  it  is  required  that  both  u  and  f  be  continuous  across 
every  endpoint  common  to  two  intervals: 

[u]i=[4  =  0  for  i €  Ie. 
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Using  this  continuity  condition,  (uji  s  0  in  terms  of  $  we 

have  (8a) 

<>i  -  <>i .  i  <>i  + 1  - 
_ 1  = _ 2 _ 

hi-^-  h»  +  ^-  (8b) 


and  in  the  case  of  equal  intervals,  which  reduces  to 


n  +  J- 


Un  1 

‘•J  n  +  i 

- =  F  2  +G? 


>o 


‘•J 


,(t+  1 


n  +  i- 

Xi  ;  2  n  + 1  . 

±1—  =  F  2+Gf*1 
>.j  ° 


(6) 


$i  +  l+  $i-i 

s  — 2__ - 2.,  i  6  Ie  (endpoints) 


Now  from  (4)  and  (S)  we  get, 

(7a)  ^j+l-2  4>j  +  ^j.l  =  2h^fj,  ielc 

2  2  J 

and  from  (6) 

(7b)  ^j+l-2^j  +  ^.l=!0,  i  €  1* 

2  2 

The  equations  (7)  and  the  boundary  conditions  lead  to 
determined  system  of  algebraic  equations  for  die  values  of  +• 
These  equations  lead  to  a  tridiagonal  system  of  equations 
which  is,  therefore,  solved  by  Thomas  algorithm. 

The  extension  to  two  dimensional  scheme  may  be  easily 
done.  Two  dimensional  stencil  is  shown  in  Fig.  2. 


K - 

Ax 


where  Fy  and  are  standard  finite  difference  expressions 

for  uxx  and  Uyy  respectively  and  x  s  At/2.  Using  the 

standard  finite  difference  scheme  we  get  the  following 
equations 

n+-L  «+•!- 

W  -U.  2.  +  (a  +  2)  U  2  -  U  2  K 
»*1.J  i.j  i+l.j 

ui;j.1+(a-2)u|;j-uhj  +  J 
2 

i,j  =  1,3 . 2M  - 1,  a  =  4 ( -x)-,  Ax  =  Ay 

At 

0°)  -uf|.11+(a  +  2)uh|1-uPjti1  = 

n+-L  ,  n  +  J-  r+-L 

u  ,2.  +  (a  -  2)  U.  .  2  -  U.  2 

l-l.J  1.J  i+».j 

ij  =  13 . 2M-  1 

and  the  continuity  condition  are 


i,j  =  2, 4, ....  2  M  -  2 

00a)  -uJJ.1,  +2uJJ*1-u5Jti1  =0 

ij  =  2,4 . 2M-2 


The  two  algebraic  systems  (9),  (9a),  and  (10),  (10a)  can 
be  solved  by  the  tridiagonal  algorithm. 


2.  Numerical  experiments 


Fig.  2. 

Consider  the  two  dimensional  diffusion  equation  u,  * 

+  u^.  The  compact  ADI  method  is  given  by  equations  (8) 
-  (10): 


One  of  the  primary  objectives  of  this  project  is  to 
vectorization  of  1-D  and  2-D  diffusion  problems.  We 
vectorize  the  2-D  heat  equation  for  the  Cray  Y-MP  using 
red-black  SOR  algorithm. 

We  consider  one  analytical  solutions  for  the  following 
diffusion  equation  and  compare  them  with  respective 
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numerical  solutions.  The  results  are  recorded  in  tables  1 . 

The  two  dimensional  diffusion  equation  is  ut  =  uxx  + 

+  f(x,y,t),  0  <  x<  1,0  <y  <1,  t  >  0.  The  following  analytical 
solutions  are  considered: 

u(x.y.t)  *  100x(l  -  xlyd-yle'1,  and  f(x,y,t)  =  200(y-y2  + 
x-x2)-  100(x-x2)(y-y2e't 

All  the  problems  mentioned  above  are  associated  with 
Dirichlet  boundary  conditions. 

The  CPT-ADI  is  compared  with  FDM-ADI  and 
FDM-SOR.  The  functional  errors  in  the  method  is  0(h2)  as 
expected  and  derivative  error  is  of  0(h).  The  result  is  shown 
in  table  1. 


FDM-ADI  At  =  Ax2 


No.  of 
Hx-hy  celli 

No.  of 
iters 

CPU  Time 
(see) 

Max  Fct 
Error 

Max  Derivative 
Error 

16 

128 

0.76 

4A924E-03 

1.01704E-02 

32 

512 

6.15 

1.1469E-03 

00 

00 

«n 

«n 

64 

2048 

49.48 

2.8667E-04 

2.66432E-02 

3.  Concluding  remarks 

The  x.  y  domains  are  devided  into  8.  16,  32  and  64  cells 
and  in  each  case,  the  starting  time  is  taken  to  be  0  and  the 
final  lime  is  1.  The  relation  between  time  step  and  space 
step  varies  among  FDM-ADI,  FDM-SOR  and  CPT-ADI 
schemes.  We  assume  all  the  material  constants  are  equal  to 
unity  and  Ax  =  Ay.  In  case  of  FDM-SOR  and  CPT-ADI.  Ax 

=  Ay  =  At  and  however  in  case  of  FDM-ADI,  At  =  (Ax)2. 
This  is  the  disadvantage  for  FDM-ADI.  For  example,  when 
Ax  =  13625E-2,  At  »  4.883E-4  and  it  would  take  2048 
iterations  to  reach  time  1.  However,  CPT-ADI  and 
FDM-SOR  are  free  from  this  difficulty.  But  SOR  also  has  a 
different  disadvantage.  Each  time  step,  FDM-SOR  takes 
large  number  of  iterations  to  converge  to  a  preassigned 
level.  With  64  cells  CPT-ADI  requires  only  64  iterations  to 
reduce  the  error  level  to  334792E-03,  where  as  FDM-ADI 
requires  2048  iterations  to  reduce  the  error  level  to 
2.8667  IE-04  and  FDM-SOR  requires  3392  iterations  to 
reduce  the  error  level  to  937374E-04. 

As  CPT  algorithm  requires  more  points  evaluation  per 
iteration,  it  is  obvious  that  CPT  requires  more  CPU  time 
than  FDM-SOR  or  FDM-ADI. 

Acknowledgments:  Thanks  to  M.  E.  Rose  for  his  many 
valuable  suggestions  for  the  completion  of  this  paper.  Also 
many  thanks  go  to  A.  Nachman  of  AFOSR. 


CPT-ADI  At  =  Ax 
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FDM-SOR  At  =  Ax 


No.  of 
Hx=hy  celli 

No.  of 
iters 

CPU  Time 
(sec) 

Max  Fct 
Error 

Max  Derivative 
Error 

16 

240 
us  1.65 

2.10 

3A050E-03 

5.869E-01 

32 

960 

1.80 

36.73 

2.155E-03 

2.940E-01 

64 

3392 

<0-1.85 

473.18 

93737E-04 

1.469E-01 

78 


ATTACHMENT  #6  A  Comparative  Study  of  Compact  Finite 
Volume  Methods  for  3-D  Diffusion  Equation  with  Finite 
Volume  ADI  and  SOR 


A  Comparative  Study  of  Compact  Finite  Volume  Methods  For  the  3-D 
Diffusion  Equations  with  Finite  Difference  ADI  and  SOR  *. 

B.  N.Borah1,  R.E. White*,  A.Kyrillidis?,  S.Shankariingam  i,  Y.JM 


University  of  Alabama 
Tuscaloosa,  Alabama 

Sponsored  by  the 
IEEE  Computer  Society 

lecnmcoi  vonwranoo  on  WMiipuitt  wiopnm 
in  cooperation  wtth 
ACM/SIGGRAPH 


A  Comparative  Study  of  Compact  Finite  Volume  Methods  For  the  3-D 
Diffusion  Equations  with  Finite  Difference  ADI  and  SOR  *. 

B.  N. Borah1,  R.E. White*,  A.Kyrillidisa,  S.Shankarlingam  i,  Y.Ji1 


1.  N.C.  A  &  T  State  University,  Greensboro,  NC 

2.  N.C.  State  University,  Raleigh,  NC 


Abstract 

A  recently  developed  Compact  Finite 
Difference  Scheme  is  applied  to  3-D 
diffusion  equations.  The  relative  merits  of 
CPT-ADI  is  compared  with  two  other 
computational  schemes  such  as  Finite 
Difference  SOR  and  Finite  Difference  ADI. 
The  numerical  results  obtained  from  these 
three  schemes  are  compared  to  known 
analytical  solutions.  The  primary  interest 
of  this  paper  lies  in  vectorization  and 
parallel  processing.  CPT-ADI  is  the  fastest 
algorithm  than  both  FDM-ADI  and  FDM 
- SOR  as  is  evident  from  the  CPU  times. 


1.  Introduction 

We  shall  briefly  describe  the  Compact 
Finite  difference  scheme  for  a  1-D  steady 
state  problem.  The  extension  to  a  3-D  case 
maybe  easily  done.  The  idea  behind 
this  approach  is  to  solve  tlje  differential 
equation  in  isolations  from  its  neighboring 
subintervals  (i.e.  compactly)  and  then 
extend  the  solution  in  die  large  by  means  of 
the  continuity  conditions  for  the  flux  and 
the  temperature  across  the  boundaries  of 
the  contiguous  subintervals  (See  Rose[l]). 

We  consider  here  1-D  Steady  diffusion 
prc  blem : 

(1)  D.E.  u'  =  f 

(2)  u 

(3)  B.C.  <t»*g 


for  X  e  I  and  I  ■  [I U] . 

Divide  the  interval  I  into  m  nonoverlapping 
subintervals: 

Ij  =  (X  /  Xj.i/2  S  X  £  Xj+i/2} 
with  center  points 

and  interior  endpoints 

Xj ,  j=l,2, — ,M-1 

We  shall  adopt  finite  difference  notations 
Axj  *  xi+J-Xi^,  h}  =  ^ 


and  u(i)  =  u* .  We  also  denote  : 

Ic  =  (Center  Fo  ints) 

Ie  =  { 1.2.3 _ M-l)  (inte  rio  r  endpoints) 


Fig.  1 

The  discrete  equations  for  (1)  and  (2)  can  be 
written  as  ,  for  j  e  Ie . 

(4)  =  Axjfj  '' 

(5  a)  =hjUj.^ 
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and 


Consider  the  3-D  diffusion  equation  ut  =  uw . 
Uyy  +  uzz.  The  Compact  ADI  method  is  eiven 
by  equations  (9)  -  (12)  : 


(5  b)  <Pj -(Pj_4  =hjuJ_4 

Next ,  it  is  required  that  both  u  and  f  be 
continuous  across  every  endpoint  common  to 
two  intervals : 

Mi  =  Mi  =  0  forieIc 

Using  this  continuity  condition ,  [u]4=  0,  in 
terms  of  9  we  get 

91-9*4  9i+^-Q>i 

hi-^  hi +£ 

and  in  case  of  equal  intervals,  this  reduces  to 

JA  t  l  m  1 


(6)  <Pi  = 


■*±4 


.  iele  (endpoints) 


Now,  from  (4)  and  (5),  we  get 

C7a)  9j4-29j+9j_^  =  2h]fj.  ielc 
and  from  (6)  we  have 

(7b)  9j4-29j+9j_£=0.  ielc 

The  equations  (7)  and  the  boundary 
conditions  leads  to  a  determined  system  of 
algebraic  systems  for  die  values  of  9,  which 
can  be  solved  by  Thomas  Algorithm. 

The  extension  to  a  three  dimmsirmai 
scheme  may  be  done  easily . 

Three  Dimensional  stencil  is  shown  in 
Fig.  2. 


(8a)  a 

(8b)  -,-ta  -1't  -  OKft  *  2H"it 

uf*f  3 

(8c)  ,1-t  =  G"tl  +  2H"4). 

1  *  J  •*  1  «  J  •* 

Uf+f  k  ~  uHl  «,  3 

(8d)  -!•-  -  C$  +  21 

where  Fij,k,  Q.j,*  and  Hjj,*  are  finite 
difference  approximation  to  u,*  ,uyy  and  u„ 

respectively  and  O  *  ^ .  Using  the  standard 
finite  difference  scheme,  we  get  the  folllowing 
equations :  0 

m  -  vsi.k+<>**y>»&-  vci* 

=  2^ur+lj.k+(1^Kj,k  +  *x<ij.k 
tjJt^l«3,  — 31-1,  s  Xy=  4.0,  Ax=Ay=Az. 

(10) 

=  Vyt  ,t*  °*ay)  Uyi+V “yt.k 

ijJt=U . 31*1,  Xz  =  Xy=  4.0,  Ax=Ay=Az 

(u) 

ijjc=l,3 . 31-1,  \z  =  Xy=  4.0,  Ax=Ay=Az 


Fig.  2. 
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( 12)  -:>.x  uR-,1, .  t*<  l*  4kx)  u»*k  -  2>.x  u£*i‘j .  k 

^(i-auul-i +*v»r*lk 

>  i.j  +1 . k  7  i.j,k  .v  i,j-l.k 
i,jje=1.3 . .2M-1,  Xz  =  Xy  =  4.0.  Ax=Ay=A z 

and  the  continuity  conditions  are 

(11a)  -uPTJ.  +  2un+J-uD4  .  =0 
m+l.k  i,j.k  W-l.k 

(12a)  -uOtfj  , k+  2u5j - u^i j, k  =0 

ijjF=2.4 . JM,  Xx  =  Xy=  4.0,  Ax=Ay=Az 

The  four  algebraic  systems  (9),(9a).(10),(10a), 
(ll).(lla)  and  (12)  &  (12a)  can  be  solved  by 
the  tridiagonal  algorithm. 

The  Finite  Difference  ADI  euations  for  3-D 
diffusion  equation  are  given  below: 

-U?:  v  i 

(13)  Ui*a  ^  -u £$  =u^+i4 

-uHL  2  .  1 

(14)  . =  + 


j,  j 


Here  a  = -4^ 


A-u^l-u^  +  U^I 


Using  the  standard  Finite  Difference  scheme 
we  get  the  following  equations  : 

(16)  + 
Mf.J  ,  k-1  "Kl- j  k 


-vCjl  i.k  *  <1+2?-y>“!!?k-  i.k 

(17)  -M,lj.k^-rt.j.k  +  ^.W 

^“^.k-i-Ki-^-^ntk 

-Mkj'k.i  +  (l+2^u«‘.k  ■ -  XXj‘k-i 

(is)  -MTa.j.k^CS.j.k+TOi.k 

+v”i,.k+<i-2xk-2v^k 

Finally ,  the  Finite  Difference  SOR  scheme  for 
the  3-D  Diffusion  Equation  is  given  below  : 

(l+6^)Utemp  =  u»  j  ik  +  Atff^j  ,k  + 

^W-i.k-Hiriii.k  +  ^i.j.k 

+  u?ljTl.k  +  u^k.1  +  u?|j,k+1) 

“ffjlk  =  tif)  j  >k  +  coUtemp,  1  £  0)  £  2, 

Here  X  =  4.0  and  m  is  the  iteration  index. 


2.  Numerical  Experiments 

One  of  the  primary  objectives  of  this  project  is 
to  vectorization  of  the  3-D  problems.  We  use 
red-black  SOR  algorithm  to  vectorize  the  3-D 
heat  equation  for  the  Cray  Y-MP. 

We  consider  one  analytical  solutions  for  foe 
following  diffusion  equation  and  compare 
them  with  respective  numerical  solution.  The 
results  are  recorded  in  Table  1. 

The  3-D  diffusion  equation  is  u<  =uxx+uyy 
+un+  f(x,y,z,t),  0<x<l,0<y<l,0<z<l, 
t>0. 

The  following  analytical  solutions  is 
considered: 

u(x,yAt)  =  100x(  1  -x)y(  1  -y)z(  1  -z)e-‘  and 
f(x,yAt)  =(200((y-y2)(z-z2)+(y-y2)(x-x2)+ 
(z-z2)(x-x2»  -  100x(l-x)y(l-y)z(l-z))e-t 

All  foe  problems  mentioned  above 
are  associated  with  Dirichlet  boundary 
conditions. 
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MaxError 


The  CPT-ADI  is  compared  with  FDM-ADI 
and  FDM-SOR  .  The  Functional  errors  in 
the  method  is  0(h2)  and  derivative  error 
is  0(h).  The  result  is  shown  in  Table  1. 


FDM-ADI 


FDM-SOR 


The  curve  (1)  is  for  the  CPT-ADI 
scheme,  curve  (2)  is  for  the  FDM-ADI 
scheme  while  curve(3)  is  for  the  FDM 
-SOR  scheme. 


MaxDerError 


#  of  Cells 


curve  (1)  above  shows  the  graph  of  the 
maxder  error  Vs.  the  number  of  cells  for 
the  CPT-ADI  scheme.  Curve  (2)  shows  the 
graph  of  maxder  error  Vs.number  of  cells 
for  the  FDM-SOR  scheme  and  curve(3) 
shows  the  graph  of  maxder  error  Vs.  die 
number  of  cells  for  the  FDM-ADI  scheme. 


Table  1 
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Cpu  Time 


Figure  3  _ 

curve  (1)  is  for  the  CPT  -  ADI  scheme, 
curve  (2)  is  for  the  FDM-SOR  scheme  and 
curve  (3)  is  for  the  FDM-ADI  scheme. 

3.  Concluding  Remarks 
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The  x,  y,  and  z  domain  are  divided  into  4, 
8 , 16  cells  and  in  each  case,  the  initial  time 
is  taken  to  be  0  and  the  final  time  is  1.  We 
assume  all  the  material  constants  to  be  equal 
to  unity  and  Ax  =  Ay  =  Az.  In  case  of  FDM- 
SOR  and  CPT-ADI,  Ax  =  Ay  =  Az  =At  and 
however  in  case  of  FDM-ADI ,  At  =  (Ax)2. 
This  is  the  disadvantage  of  FDM-ADI. 
However,  CPT-ADI  and  FDM-SOR  are  free 
from  this  difficulty.  But  SOR  has  a  different 
disadvantage,  for  each  timestep,  FDM-SOR 
takes  large  number  of  iterations  to  converge 
to  a  preassigned  level  With  16  cells ,  CPT- 
ADI  requires  16  iterations  to  reduce  die 
error  level  to  4.59  E  -3  .  whereas  FDM-ADI 
requires  256  iterations  to  reduce  tire  error 
level  to  6.0  E -4,  whereas  FDM-SOR 
requires  304  iterations  to  reduce  the  error 
level  to  2.4872  E-4. 

In  comparing  CPU  times  among  all  the 
three  approaches,  it  is  found  that  CPU  time 
for  FDM-ADI  is  the  worst  For  the 
problem  with  16  cells ,  CPT-ADI  takes 
only 2.346  sec.  However,  FDM-ADI 
approach  takes  10.72  secs  for  the  same 
problem. 

In  terms  of  derivative  emor,  CPT-ADI 
is  the  best,  it  is  close  to  the  order  of  Ofh2). 
The  error  analysis  for  the  three  approaches 
are  shown  in  Figure  3. 
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